django-import-export-celery's Introduction

django-import-export-celery: process slow Django imports and exports in Celery

django-import-export-celery helps you process long-running imports and exports in Celery.

Basic installation

  1. Set up Celery to work with your project (see the minimal sketch after this list).
  2. Add 'import_export_celery' to your INSTALLED_APPS setting.
  3. Add 'author.middlewares.AuthorDefaultBackendMiddleware' to your MIDDLEWARE setting (MIDDLEWARE_CLASSES on older Django); this middleware comes from the django-author package, which is required.
  4. Configure the location of your celery module setup:

    IMPORT_EXPORT_CELERY_INIT_MODULE = "projectname.celery"
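
For reference, a minimal sketch of the Celery module that IMPORT_EXPORT_CELERY_INIT_MODULE points to, following the standard Celery-for-Django setup ("projectname" is a placeholder for your project package):

# projectname/celery.py
import os

from celery import Celery

# Point Celery at your Django settings before the app is created.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "projectname.settings")

app = Celery("projectname")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks()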

Setting up imports with celery

A fully configured example project can be found in the example directory of this repository.

  1. Perform the basic setup procedure described above.
  2. Configure the IMPORT_EXPORT_CELERY_MODELS variable.

    def resource():  # Optional
        from myapp.models import WinnerResource
        return WinnerResource
    
    
    IMPORT_EXPORT_CELERY_MODELS = {
        "Winner": {
            'app_label': 'winners',
            'model_name': 'Winner',
            'resource': resource,  # Optional
        }
    }

    The available parameters are app_label, model_name, and resource. 'resource' should be a function which returns a django-import-export Resource (a minimal sketch of such a resource follows this list).

  3. Done
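
For illustration, a minimal sketch of the WinnerResource imported by the resource() function above (a hypothetical myapp/models.py, matching the import in resource(); assuming the Winner model lives in the winners app, as the app_label suggests):

# myapp/models.py (hypothetical location)
from import_export.resources import ModelResource

from winners.models import Winner  # assumption: Winner lives in the winners app


class WinnerResource(ModelResource):
    class Meta:
        model = Winner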

By default, a dry run of the import is initiated when the import object is created. To instead import the file immediately, without a dry run, set IMPORT_DRY_RUN_FIRST_TIME to False:

IMPORT_DRY_RUN_FIRST_TIME = False

Performing an import

You will find an example Django application that uses django-import-export-celery for importing data. There are instructions for running the example application in the example directory's README file. Once you have it running, you can perform an import with the following steps.

  1. Navigate to the example application's admin page.

  2. Navigate to the ImportJobs table.

  3. Create a new import job. There is an example import CSV file in the example/example-data directory. Select that file. Select csv as the file format. We'll be importing into the Winner model's table.

  4. Select "Save and continue editing" to save the import job, then refresh until you see that a "Summary of changes made by this import" file has been created.

  5. You can view the summary if you want. Your import has NOT BEEN PERFORMED YET!

  6. Return to the import-jobs table, select the import job we just created, and select the "Perform import" action from the actions dropdown.

  7. In a short time, your imported Winner object should show up in your Winners table.

Setting up exports

As with imports, a fully configured example project can be found in the example directory.

  1. Add an export_resource_classes classmethod to the model you want to export.
    @classmethod
    def export_resource_classes(cls):
        return {
            'winners': ('Winners resource', WinnersResource),
            'winners_all_caps': ('Winners with all caps column resource', WinnersWithAllCapsResource),
        }

    This should return a dictionary of tuples. The keys should be unique, unchanging strings; each tuple should consist of a human-friendly description and the corresponding resource class, in that order (as in the example above).

  2. Add the create_export_job_action to the model's ModelAdmin.
    from django.contrib import admin
    from import_export_celery.admin_actions import create_export_job_action
    
    from . import models
    
    
    @admin.register(models.Winner)
    class WinnerAdmin(admin.ModelAdmin):
        list_display = (
            'name',
        )
    
        actions = (
            create_export_job_action,
        )
  3. To customise the export queryset, add get_export_queryset to the ModelResource. (This example assumes from django.db.models import OuterRef, Subquery and an FCMDevice model from the surrounding project.)
    class WinnersResource(ModelResource):
        class Meta:
            model = Winner

        def get_export_queryset(self):
            """Customise the queryset of the model resource, e.g. with an annotation."""
            return self.Meta.model.objects.annotate(
                device_type=Subquery(
                    FCMDevice.objects.filter(user=OuterRef("pk")).values("type")[:1]
                )
            )
  4. Done!

Performing exports with celery

  1. Perform the basic setup procedure described in the first section.
  2. Open up the object list for your model in django admin, select the objects you wish to export, and select the Export with celery admin action.
  3. Select the file format and resource you want to use to export the data.
  4. Save the model.
  5. You will receive an email when the export is done; click on the link in the email.
  6. Click on the link near the bottom of the page titled Exported file.

Excluding export file formats in the admin site

All available file formats to export are taken from the Tablib project.

To exclude or disable file formats in the admin site, configure the IMPORT_EXPORT_CELERY_EXCLUDED_FORMATS Django settings variable. This variable is a list of format strings written in lower case:

IMPORT_EXPORT_CELERY_EXCLUDED_FORMATS = ["csv", "xls"]

Customizing File Storage Backend

Define a custom storage backend by adding an IMPORT_EXPORT_CELERY_STORAGE entry to your Django STORAGES setting. For instance:

STORAGES = {
    "IMPORT_EXPORT_CELERY_STORAGE": {
        "BACKEND": "storages.backends.s3boto3.S3Boto3Storage",
    },
}

Customizing Task Time Limits

By default, there is no time limit on celery import/export tasks. This can be customized by setting the following variables in your Django settings file.

# set import time limits (in seconds)
IMPORT_EXPORT_CELERY_IMPORT_SOFT_TIME_LIMIT = 300  # 5 minutes
IMPORT_EXPORT_CELERY_IMPORT_HARD_TIME_LIMIT = 360  # 6 minutes

# set export time limits (in seconds)
IMPORT_EXPORT_CELERY_EXPORT_SOFT_TIME_LIMIT = 300  # 5 minutes
IMPORT_EXPORT_CELERY_EXPORT_HARD_TIME_LIMIT = 360  # 6 minutes

Customizing email template for export job completion email

By default, this is the subject and template used to send the email:

Subject: 'Django: Export job completed'
Email template: 'email/export_job_completion.html'

The default email template can be found in the package source.

The default email subject and template can be customized by overriding these values in your Django settings:

EXPORT_JOB_COMPLETION_MAIL_SUBJECT="Your custom subject"
EXPORT_JOB_COMPLETION_MAIL_TEMPLATE="path_to_folder/your_custom_template.html"

The email template will get some context variables that you can use to customize your template.

{
    export_job: The current instance of the ExportJob model
    app_label: export_job.app_label
    model: export_job.model
    link: A link to the export_job instance in the Django admin
}
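
As an illustration, a minimal custom template sketch using these context variables (a hypothetical your_custom_template.html, assuming standard Django template rendering):

<!-- your_custom_template.html (hypothetical) -->
<p>The export of {{ app_label }}.{{ model }} has completed.</p>
<p><a href="{{ link }}">View export job {{ export_job.pk }} in the Django admin.</a></p>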

For developers of this library

You can enter a preconfigured dev environment by first running make and then launching ./develop.sh to get into a docker compose environment packed with Redis, Celery, Postgres, and everything you need to run and test django-import-export-celery.

Before submitting a PR, please run flake8 and (in the example directory) python3 manage.py test.

Please note that you need to restart Celery for changes to propagate to the workers. Do this with docker-compose down celery, then docker-compose up celery.

Commercial support

Commercial support is provided by gradesta s.r.o.

Credits

django-import-export-celery was developed by the Czech non-profit auto*mat z.s.


django-import-export-celery's Issues

Celery 5 support

Hi there,

I've noticed that this package doesn't support Celery 5, since the celery.task module, which is used in this package's tasks module, has been deprecated.

I am happy to take a look to see if there is an option to fix this in a PR but I thought I'd raise it here to see if you have any thoughts.

Thanks,
Matt

Add changelist template

Currently the way import/export works is with an action, from the action dropdown.

The problem here is that sometimes you want custom logic that is not based on selecting elements from a model list. In my case I need an export action that exports hundreds of thousands of rows to CSV, where the user makes a selection in a dropdown rather than passing a list of pks. Instead I am overriding the get_queryset method of the resource in question.

Missing migration

I integrated this into one of the projects I'm working on and it seems fine, but I noticed the migration issue: even if I run makemigrations, the migration file doesn't apply properly.

The generated migration is under this path /usr/local/lib/python3.8/site-packages/import_export_celery/migrations/0008_auto_20211122_1340.py, but when I try to navigate to that directory, nothing is generated...

I also know that a similar issue was created, but I can't seem to find proper documentation about migration modules. Any ideas?

Importation job doesn't start

Hi all, I have been trying to integrate this library into my Django app with Celery 4.4. Everything seems to be going smoothly with these configurations:

# CELERY STUFF
REDIS_HOST = 'localhost'
REDIS_PORT = '6379'
BROKER_URL = 'redis://' + REDIS_HOST + ':' + REDIS_PORT + '/0'
BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 3600}
CELERY_RESULT_BACKEND = 'redis://' + REDIS_HOST + ':' + REDIS_PORT + '/0'
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"
REDIS_URL = os.environ.get('REDIS_URL', 'redis://redis')
IMPORT_EXPORT_CELERY_INIT_MODULE = "beaver.celery"
IMPORT_EXPORT_CELERY_MODELS = {
    "Price": {
        'app_label': 'prices',
        'model_name': 'Price',
    }
}

redis-cli is perfectly responding to PING, but I'm stuck with nothing happening.

Could you suggest where I can look for the problem? Thanks.

Job finished but no objects created

Job status info: 5/5 Import job finished

I see thousands of rows in the import summary with the change_type "new", but nothing was added to the database.

update project setup to make collaboration easier

Hi There,

So I forked this repo to make some updates and encountered some difficulties with the setup. I would like to suggest the following updates:

  1. migrate to using poetry instead of setup.py, and then add the dependencies and dev dependencies at the root level.
  2. migrate to using github actions, works with poetry: https://github.com/snok/install-poetry
  3. add pre-commit
  4. add contribution guidelines
  5. add unit tests; currently there is really nothing there

I would be glad to help with some of these if you are interested.

Troubles exporting huge table

Hello, I've run into trouble while exporting a table with 180,000+ rows.
Here is a traceback:

[2022-06-23 15:06:51,905: ERROR/ForkPoolWorker-17] Task import_export_celery.tasks.run_export_job[9ed87b13-927c-4acc-aad3-fa9d3a1243ea] raised unexpected: MemoryError()
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/import_export_celery/tasks.py", line 225, in run_export_job
    serialized = format.export_data(data)
  File "/usr/local/lib/python3.9/site-packages/import_export/formats/base_formats.py", line 88, in export_data
    return dataset.export(self.get_title(), **kwargs)
  File "/usr/local/lib/python3.9/site-packages/tablib/core.py", line 427, in export
    return fmt.export_set(self, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/tablib/formats/_csv.py", line 33, in export_set
    return stream.getvalue()
MemoryError   
[2022-06-23 15:06:54,726: ERROR/MainProcess] Pool callback raised exception: MemoryError('Process got: ')
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 1796, in safe_apply_callback
    fun(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/celery/worker/request.py", line 730, in on_success
    return self.on_failure(retval, return_ok=True)
  File "/usr/local/lib/python3.9/site-packages/celery/worker/request.py", line 545, in on_failure
    raise MemoryError(f'Process got: {exc}')
MemoryError: Process got: 

It crashes at the very end of exporting.

I have no idea what to do. Here is what the 'top' command showed at the end of the process while crashing; it seems my server has enough memory (screenshots omitted).

Concurrency is not specified; Celery defaults to 16 (the number of cores the server has).

For comparison, a model with 40,000 rows exported successfully and much faster.

Any ideas, please?

Maintainers wanted

Hello,
it seems this project is getting lasting attention. It deserves more maintainers. If you'd like to step up to the plate, just comment below. :)
Thanks

Tags missing

PyPI has v1.1.3; however, there is no tag here on GitHub for that release.

schedule to Import data

Could we have a way to schedule data imports into Django? For example, importing data and updating a report every 30 minutes?
Thanks,

TypeError: NoneType takes no arguments

It happens to me randomly when starting an export job. The job then runs when clicking on it ("Run with Celery") once again in ExportJobs in the Django admin. Any ideas?

2021-02-23T06:17:09.585791+00:00 app[worker.1]: [2021-02-22 22:17:09,585: ERROR/ForkPoolWorker-7] Task import_export_celery.tasks.run_export_job[6bbc3607-6179-4ab6-b3e1-a23bb82eb4c4] raised unexpected: TypeError('NoneType takes no arguments')
2021-02-23T06:17:09.585792+00:00 app[worker.1]: Traceback (most recent call last):
2021-02-23T06:17:09.585793+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 412, in trace_task
2021-02-23T06:17:09.585793+00:00 app[worker.1]:     R = retval = fun(*args, **kwargs)
2021-02-23T06:17:09.585794+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 704, in __protected_call__
2021-02-23T06:17:09.585794+00:00 app[worker.1]:     return self.run(*args, **kwargs)
2021-02-23T06:17:09.585794+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/sentry_sdk/integrations/celery.py", line 171, in _inner
2021-02-23T06:17:09.585794+00:00 app[worker.1]:     reraise(*exc_info)
2021-02-23T06:17:09.585795+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/sentry_sdk/_compat.py", line 57, in reraise
2021-02-23T06:17:09.585795+00:00 app[worker.1]:     raise value
2021-02-23T06:17:09.585796+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/sentry_sdk/integrations/celery.py", line 166, in _inner
2021-02-23T06:17:09.585796+00:00 app[worker.1]:     return f(*args, **kwargs)
2021-02-23T06:17:09.585796+00:00 app[worker.1]:   File "/usr/local/lib/python3.8/site-packages/import_export_celery/tasks.py", line 199, in run_export_job
2021-02-23T06:17:09.585797+00:00 app[worker.1]:     class Resource(resource_class):
2021-02-23T06:17:09.585797+00:00 app[worker.1]: TypeError: NoneType takes no arguments

Celery memory leak with large files

I am facing a blocking issue using django-import-export and django-import-export-celery.

Context

I need to import large CSV files (about 250k lines) into my database.
I work in my local environment and have a few others available (dev, staging, prod).

Issue

When I perform the import in my local environment, the import is quite long but it eventually works great.
But each time I try to perform an import in the dev environment, I get this error from Celery:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/billiard/pool.py", line 1267, in mark_as_worker_lost
    human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL).

It seems to be a memory usage issue, but I can't figure out why it occurs, as I have tried changing many settings (celery max-tasks-per-child option, celery max-memory-per-child option, DEBUG Django setting).

Also, I tried increasing my instance memory up to 13 GB (from 1 GB before), but the error still occurs.

Questions

Do you have any insight that I can use to solve my issue?
Is a 250k-line file too much?
Are my Celery settings bad?

Override Import Job form

There is an import job form; as in the django_import_export library, we override the import form, but here we can only change the ImportJobForm layout and cannot get the values of the fields.

How to show progress of import in dry_run=True?

if row_number % 100 == 0 or row_number == 1:
    change_job_status(
        import_job,
        "import",
        "3/5 Importing row %s/%s" % (row_number, len(dataset)),
        dry_run,
    )

def before_import_row(self, row, **kwargs):

This does not work for a dry run. I think it is due to the atomic transaction, so the database is not updated when doing a dry run. It works only when dry_run = False and using_transactions = False.

Issue with dry_run and skip_diff

I'm seeing a dry run using skip_diff or skip_html_diff result in the error:

Import error 'NoneType' object is not iterable

This comes about here, in _run_import_job:

    if dry_run:
        summary = "<html>"
        summary += "<head>"
        summary += '<meta charset="utf-8">'
        summary += "</head>"
        summary += "<body>"
        summary += '<table  border="1">'  # TODO refactor the existing template so we can use it for this

        if not result.invalid_rows:
            cols = lambda row: "</td><td>".join([field for field in row.diff])

If skip_diff or skip_html_diff are enabled, the row has no attribute diff.
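
A defensive rewrite along these lines (a hypothetical sketch, not the library's actual fix) would avoid iterating over a missing diff:

        if not result.invalid_rows:
            # Rows have no `diff` attribute when skip_diff/skip_html_diff are
            # enabled, so fall back to an empty list instead of failing.
            cols = lambda row: "</td><td>".join(
                [field for field in getattr(row, "diff", None) or []]
            )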

"Models aren't loaded yet"

Following the instructions on how to set up exports, I am running into a "Models aren't loaded yet." error.

I added the Resource in the models.py file, but I have more than one app with models.py files.

The error is caused by the ModelResource subclass defined at the very end of the models.py file.

This is the classmethod I added to the model:

class Ride(models.Model):
    @classmethod
    def export_resource_classes(cls):
        return {
            'all rides': ('Ride resources', RideResource),
        }

This is the RideResource at the end of the file:

class RideResource(ModelResource):
    class Meta:
        model = Ride
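
One way around this (a hedged sketch; deferring the import is an assumption about what resolves the load-order problem described above) is to resolve the resource only when the classmethod is actually called:

from django.db import models


class Ride(models.Model):
    @classmethod
    def export_resource_classes(cls):
        # Import lazily so RideResource is only resolved once
        # the app registry is fully populated.
        from .resources import RideResource  # hypothetical resources.py module
        return {
            'all rides': ('Ride resources', RideResource),
        }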

How to define task queue for the django-import-export-celery task

In my Django application I'm using a task queue. Whenever I try to execute a model export, it doesn't trigger the django-import-export-celery task. Maybe I need to define the task queue for the shared_task from this library. I wanted to ask if there is any way to override the queue settings so that the task will be executed on the right Celery task queue? @auto-mat

Add more efficient way to export all records in a very large table

Say I have a table with 2.5 million rows. The way this library works, it sends a list of primary keys of all objects to export in the job-creation POST request. For very large sets of objects, uploading megabytes of primary keys can be quite slow, or can even run up against POST request body-size limits and other miscellaneous edge cases.

So, I think it could be useful to efficiently handle the special case of exporting everything.

Perhaps this looks like one of the django-admin gray action buttons, with a confirmation page which creates the job.

Changes likely required:

  • possible modification to the export job model to represent "all"
  • adding a confirmation view page for exporting all
  • adding a django admin mixin to add the grey action button
  • docs update

Happy to send a PR if you think this is a good addition.

You forgot to run the makemigrations command

You don't have this migration file:

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("import_export_celery", "0007_auto_20210210_1831"),
    ]

    operations = [
        migrations.AlterField(
            model_name="exportjob",
            name="id",
            field=models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name="ID"),
        ),
        migrations.AlterField(
            model_name="importjob",
            name="id",
            field=models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name="ID"),
        ),
    ]

Importing fails or raises error for model that contains primary key which is not id

I was trying to import a model resource whose primary key is not the id. When I try to import it from the celery import admin section, it fails and raises errors, but the same file is able to create/update rows using the django-import-export import action.

This is my model resource

class RoleDetailResource(resources.ModelResource):

    class Meta:
        model = RoleDetail
        chunk_size = 5000
        import_id_fields = ("owner", )
        exclude = ("id", )

And I'm getting errors while importing, both with and without a dry run (error screenshots omitted).

But the import was successful with the django-import-export action, and the row can be seen in the list.

@timthelion

Does this package allow for automatic task queue execution?

This package seems interesting; however, I have concerns about automatic task queue handling. Let's say I upload a dataset and it creates a celery task/job for it. I want it to be executed automatically, without me having to select it and execute it from the admin panel as shown in the screenshots.

How to turn off dry option for importing models data

I was looking for an option so that, on form save for importing model data, the dry run could be silenced or turned off. So far I haven't found any way other than re-running it using the Perform Import admin action.

So I would suggest controlling this setting from the Django settings configuration: keep it turned on by default, but if a Django setting is found, use that option. For instance, settings.IMPORT_DRY_RUN could be used like below in the post_save signal for the import job:

@receiver(post_save, sender=ImportJob)
def importjob_post_save(sender, instance, **kwargs):
    dry_run = True
    if settings.IMPORT_DRY_RUN:
        dry_run = settings.IMPORT_DRY_RUN
    if not instance.processing_initiated:
        instance.processing_initiated = timezone.now()
        instance.save()
        transaction.on_commit(lambda: run_import_job.delay(instance.pk, dry_run=dry_run))

@timthelion

[Dry run] Error reading file

After the import job succeeded, I just get this error:

Error reading file: [Errno 2] No such file or directory: 'project/django-import-export-celery-import-jobs/file.csv'

The file is saved correctly in the ImportJob instance, that is, in 'project/media/django-import-export-celery-import-jobs/file.csv', but the celery import doesn't find it (it doesn't resolve the path from MEDIA_ROOT).

Unable to export large amount of data

Hi, I am not able to export a large amount of data (10 million rows).
The system gets a timeout and takes a massive amount of memory.
Regarding the JSON list of pks to export: instead of ids, can we save the query?

Creating async import_jobs via code

Is there a command or function that can be used to create an import job?

Use case: data from a single CSV is being used to populate multiple related tables. It would be a cleaner interface if the user could update all the related tables with a single upload.

In order to do so, we've created resources for the related tables in the after_import of the parent resource. The issue is that the imports for the related tables are locked to the parent import and cannot be spun into separate threads. Is there a way to create a celery import job from code?
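
For what it's worth, a heavily hedged sketch of what creating a job from code might look like, based only on the ImportJob field names visible in the logs on this page (file, format, model); the exact field values and whether saving alone schedules the celery task are assumptions:

from django.core.files.base import ContentFile

from import_export_celery.models import ImportJob


def create_import_job(csv_bytes):
    job = ImportJob(
        format="csv",    # assumption: matches the format chosen in the admin
        model="Winner",  # assumption: a key from IMPORT_EXPORT_CELERY_MODELS
    )
    # Saving the file persists the job; per the post_save handler quoted in the
    # dry-run issue above, creating the job is what schedules the celery task.
    job.file.save("import.csv", ContentFile(csv_bytes))
    return job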

Unable to see add button under export screen

Hi, I am not able to see the add button on the django-export screen. Do I need to give any extra permission?
After modifying the "has_add_permission" function (to return True), I was able to see the add button, but I still cannot see the module information.

Missing middleware

Hi, the docs need to be updated with an explanation that the AuthorMiddleware is required; I'm getting this error:

Error "author.middlewares.AuthorDefaultBackendMiddleware" is not found in MIDDLEWARE_CLASSES nor MIDDLEWARE. It is required to use AuthorDefaultBackend

Also, since this is apparently a part of this package as well, maybe it should be clarified that the author middleware is a requirement here.

How do you load a resource in IMPORT_EXPORT_CELERY_MODELS?

IMPORT_EXPORT_CELERY_MODELS = {
    "Winner": {'app_label': 'winners', 'model_name': 'Winner'}
}

"The available parameters are app_label, model_name, and resource. 'resource' should be a function which returns a django-import-export Resource."

This is what the documentation says but we can't add a resource before loading the app. Also, adding a string doesn't work.
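
For reference, the README section earlier on this page sidesteps the app-loading problem by wrapping the import in a callable, so the resource module is only imported when the function is called:

def resource():
    from myapp.models import WinnerResource  # deferred until apps are loaded
    return WinnerResource


IMPORT_EXPORT_CELERY_MODELS = {
    "Winner": {
        'app_label': 'winners',
        'model_name': 'Winner',
        'resource': resource,
    }
}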

Move the `queryset` field to a FileField

It would be better as a FileField that took a CSV file, to allow one to manually create a list of items to be exported. Even better would be support for columns: normally pk would be set, but it could be, for example, email with a list of email addresses. Or, even more interestingly, it could be a single column paid whose only value would be True, and it would export all paid users...

TypeError: argument of type 'NoneType' is not iterable

Hi,

I am trying to run the project inside the examples folder. Before that I want to run the Celery workers, so I tried to run Celery and I am getting the error below.

I think it is coming from the tasks.py file. Can you please help?

TypeError: argument of type 'NoneType' is not iterable

I am using Python 3.7.x and Django 2.1.8.

Regards,
Yogesh

Error handling

If the import ends with an error, the dry import finishes and a summary is still created, but it is empty; only the header is there. It's strange.

I know the error is in the details of the import, but still, who opens that?
TODO:

  1. do not render the summary
  2. print the error to the summary
  3. print that the dry import ended with an error (I think that worked in an older version?)

Catch TimeLimitExceeded exception

Job status info: [Dry run] 4/5 Generating import summary
The job status info was stuck on this step, and when I checked the logs, I found a TimeLimitExceeded exception.

2020-06-23T16:53:24.028548585Z app[worker.2]: [2020-06-23 09:53:24,027: ERROR/MainProcess] Task handler raised error: TimeLimitExceeded(300)
2020-06-23T16:53:24.028599784Z app[worker.2]: Traceback (most recent call last):
2020-06-23T16:53:24.028605055Z app[worker.2]:   File "/app/.heroku/python/lib/python3.7/site-packages/billiard/pool.py", line 684, in on_hard_timeout
2020-06-23T16:53:24.028609276Z app[worker.2]:     raise TimeLimitExceeded(job._timeout)
2020-06-23T16:53:24.028625581Z app[worker.2]: billiard.exceptions.TimeLimitExceeded: TimeLimitExceeded(300,)
2020-06-23T16:53:24.036035220Z app[worker.2]: [2020-06-23 09:53:24,035: ERROR/MainProcess] Hard time limit (300s) exceeded for import_export_celery.tasks.run_import_job[45422346-1572-4e69-ad7e-77a488cbf2ee]

A quick change to the config should fix the issue, but I think we should catch the exception and add it to the job status info.

No module named 'project'

Hi Team,

This library looks great; I would love to use it for my project! Unfortunately, when I add 'import_export_celery' to my INSTALLED_APPS, I get a ModuleNotFoundError due to the expectation that the folder containing celery.py is called 'project' (from __init__.py).

Renaming this folder would require a huge amount of effort. Any way that we could use a setting to control the top-level directory name?

Also, in case I'm interpreting it wrong, the stack trace is below. I am running Django 3.0 with Gunicorn and Python 3.8. The project is containerized with two separate Celery worker containers.

Stack Trace:

backend              | Traceback (most recent call last):
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
backend              |     worker.init_process()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/workers/base.py", line 129, in init_process
backend              |     self.load_wsgi()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
backend              |     self.wsgi = self.app.wsgi()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/app/base.py", line 67, in wsgi
backend              |     self.callable = self.load()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
backend              |     return self.load_wsgiapp()
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
backend              |     return util.import_app(self.app_uri)
backend              |   File "/opt/venv/lib/python3.8/site-packages/gunicorn/util.py", line 350, in import_app
backend              |     __import__(module)
backend              |   File "/home/default/delta_mvp/wsgi.py", line 11, in <module>
backend              |     application = get_wsgi_application()
backend              |   File "/opt/venv/lib/python3.8/site-packages/django/core/wsgi.py", line 12, in get_wsgi_application
backend              |     django.setup(set_prefix=False)
backend              |   File "/opt/venv/lib/python3.8/site-packages/django/__init__.py", line 24, in setup
backend              |     apps.populate(settings.INSTALLED_APPS)
backend              |   File "/opt/venv/lib/python3.8/site-packages/django/apps/registry.py", line 91, in populate
backend              |     app_config = AppConfig.create(entry)
backend              |   File "/opt/venv/lib/python3.8/site-packages/django/apps/config.py", line 90, in create
backend              |     module = import_module(entry)
backend              |   File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
backend              |     return _bootstrap._gcd_import(name[level:], package, level)
backend              |   File "/opt/venv/lib/python3.8/site-packages/import_export_celery/__init__.py", line 1, in <module>
backend              |     from project.celery import app as celery_app
backend              | ModuleNotFoundError: No module named 'project'

Getting django.core.exceptions.ImproperlyConfigured error in Celery

Below is the complete error log from the Celery server.

[2019-08-04 17:46:26,578: INFO/MainProcess] Received task: import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b]
[2019-08-04 17:46:26,588: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x7f329f349400> (args:('import_export_celery.tasks.run_import_job', 'cb18c31e-fee4-4072-be25-d46d07f0798b', {'lang': 'py', 'task': 'import_export_celery.tasks.run_import_job', 'id': 'cb18c31e-fee4-4072-be25-d46d07f0798b', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'retries': 0, 'timelimit': [None, None], 'root_id': 'cb18c31e-fee4-4072-be25-d46d07f0798b', 'parent_id': None, 'argsrepr': '(2,)', 'kwargsrepr': "{'dry_run': True}", 'origin': 'gen1874@ubuntu-bionic', 'reply_to': '63856816-dad3-32bc-a9b9-c392f1929017', 'correlation_id': 'cb18c31e-fee4-4072-be25-d46d07f0798b', 'delivery_info': {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': None}}, b'[[2], {"dry_run": true}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]', 'application/json', 'utf-8') kwargs:{})
[2019-08-04 17:46:26,598: INFO/ForkPoolWorker-4] import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b]: Importing 2 dry-run True
[2019-08-04 17:46:26,631: DEBUG/MainProcess] Task accepted: import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b] pid:2050
[2019-08-04 17:46:26,726: DEBUG/ForkPoolWorker-4] params: (2,)
[2019-08-04 17:46:26,727: DEBUG/ForkPoolWorker-4]
sql_command: SELECT "import_export_celery_importjob"."id", "import_export_celery_importjob"."file", "import_export_celery_importjob"."processing_initiated", "import_export_celery_importjob"."imported", "import_export_celery_importjob"."format", "import_export_celery_importjob"."change_summary", "import_export_celery_importjob"."errors", "import_export_celery_importjob"."model", "import_export_celery_importjob"."author_id", "import_export_celery_importjob"."updated_by_id" FROM "import_export_celery_importjob" WHERE "import_export_celery_importjob"."id" = %(0)s
[2019-08-04 17:46:26,866: DEBUG/ForkPoolWorker-4] Find query: {'filter': {'id': {'$eq': 2}}, 'projection': ['id', 'file', 'processing_initiated', 'imported', 'format', 'change_summary', 'errors', 'model', 'author_id', 'updated_by_id']}
[2019-08-04 17:46:26,882: DEBUG/ForkPoolWorker-4] import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b]: None
[2019-08-04 17:46:27,134: ERROR/ForkPoolWorker-4] Task import_export_celery.tasks.run_import_job[cb18c31e-fee4-4072-be25-d46d07f0798b] raised unexpected: ImproperlyConfigured()
Traceback (most recent call last):
File "/home/vagrant/env/lib/python3.6/site-packages/celery/app/trace.py", line 385, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/vagrant/env/lib/python3.6/site-packages/celery/app/trace.py", line 648, in protected_call
return self.run(*args, **kwargs)
File "../import_export_celery/tasks.py", line 61, in run_import_job
result = resource.import_data(dataset, dry_run=dry_run)
File "/home/vagrant/env/lib/python3.6/site-packages/import_export/resources.py", line 573, in import_data
raise ImproperlyConfigured

django.core.exceptions.ImproperlyConfigured

It's coming from the tasks.py file, from the below-mentioned code.

result = resource.import_data(dataset, dry_run=dry_run)

Note: I am using Vagrant, not Docker, and I am using MongoDB.

  • Logged in to the admin
  • Uploaded the CSV file from the examples folder
  • Clicked on Save and Continue editing
  • In the Celery terminal I got the above error.

Thank you and Regards,
Yogesh

Migrations pending

Installed apps:
django-import-export-celery==1.1.3
Issue:
E AssertionError: Your models have changes that are not yet reflected in a migration. You should add them now. Relevant app(s): dict_keys(['import_export_celery'])

It looks like there are pending migrations in the latest release, and local migrations are then failing in tests. I can't run makemigrations as a workaround because I don't have permission on the folder where the library code is located.

Race condition calling `run_import_job` in `importjob_post_save` upon creation

I have this error popping up 90% of the time using version 1.1.4:

[2021-07-27 12:27:02,269: ERROR/ForkPoolWorker-8] Task import_export_celery.tasks.run_import_job raised unexpected: DoesNotExist('ImportJob matching query does not exist.',)
Traceback (most recent call last):
  File "/celery/app/trace.py", line 412, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/celery/app/trace.py", line 704, in __protected_call__
    return self.run(*args, **kwargs)
  File "/import_export_celery/tasks.py", line 187, in run_import_job
    import_job = models.ImportJob.objects.get(pk=pk)
  File "/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/django/db/models/query.py", line 380, in get
    self.model._meta.object_name
import_export_celery.models.importjob.DoesNotExist: ImportJob matching query does not exist.

This happens when run_import_job is called at creation in importjob_post_save.

I don't have the issue anymore if I replace this line with:

run_import_job.apply_async((instance.pk,), kwargs={'dry_run': True}, countdown=1)

Renaming the app and hide import

Hello!

I think the AppConfig setup is missing, because there is no defined app name.

So how can the app be renamed for the backend users?
And how can the import admin page be removed from the menu?

django.core.exceptions.FieldDoesNotExist: ExportJob has no field named 'job_status_info'

During file export, I am getting the error below.
ExportJob has no field named 'job_status_info'

/lib/python3.6/site-packages/Django-2.2.3-py3.6.egg/django/db/models/options.py", line 567, in get_field
raise FieldDoesNotExist("%s has no field named '%s'" % (self.object_name, field_name))
django.core.exceptions.FieldDoesNotExist: ExportJob has no field named 'job_status_info'

During handling of the above exception, another exception occurred:

How to provide customised querysets for the export job

Currently, I'm trying to export a model which might contain a lot of annotated data from different models related to the user model.
But when I export, I get only the model's own data; the annotated data is missing. The annotations do appear when using the export functionality of the django-import-export export action.
I was wondering if there is any way to customise the queryset for the targeted export model with a celery export.
@timthelion
