Comments (12)

yadayada commented on August 14, 2024

Strictly speaking, no. But it's possible to launch multiple instances, e.g. to upload different directories.

chrisidefix commented on August 14, 2024

Have you tested this (running multiple instances in parallel)?
I wonder whether this would cause conflicts when multiple processes try to write to the same database, or whether there is some magic that keeps the database consistent even when two or more processes write to it at the same time.

PS: this is very high on my wish-list as well, since the ACD Desktop App can't even do that, but then again - what can it do 💥 ?

chrisidefix commented on August 14, 2024

To be clear - this is a great addition, but you have to consider that it will only be useful for people who have a lot of upload bandwidth available. The server caps the connection speed at 10 MB/s (80 Mbit/s) per connection in my experience (e.g. you can open a few browser tabs and upload large files in parallel to try this out). Many people may have fast download connections, but nowhere near these upload rates.

Imagine your ISP limits your upload to 10 MB/s. What use is it really to upload 10 files at the same time at 1 MB/s each, when you would be just as fast uploading them one after the other at 10 MB/s each?

As I said, this only becomes interesting when you have multiples of 10 MB/s of upload speed.

If you have a proper fibre-optic connection, you might get there; with anything else, forget about it.
Example plans:

Example US Providers

V.Fios: you need a plan above the 75/75 Mbit/s option and also reach these speeds for it to make any sense (matching upload & download speeds are nice, but not always common). They seem to offer up to 500/500 Mbit/s, which should be enough for just over 6 parallel connections.
C.Xfinity: only offers asymmetric options with upload speeds around 20 Mbit/s, which is still 4 times slower than the maximum per-connection bandwidth available.
G.Fibre: here you get 1000 Mbit/s and you could therefore maintain 12.5 connections at the same time.

Alright, I am drifting off-topic, I'm afraid, but my point was simply to show that you need access to a very fast connection for this to be beneficial.
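
For reference, here is that arithmetic as a quick Python sketch (the 80 Mbit/s per-connection cap and the plan speeds are just the figures quoted above, not measurements):

```python
# Back-of-the-envelope: how many ~80 Mbit/s (10 MB/s) connections can a
# given upload plan keep busy? Plan speeds are the figures quoted above.
PER_CONNECTION_MBIT = 80  # observed per-connection cap

plans_upload_mbit = {
    "V.Fios 75/75": 75,
    "V.Fios 500/500": 500,
    "C.Xfinity": 20,
    "G.Fibre": 1000,
}

for name, mbit in plans_upload_mbit.items():
    print(f"{name}: worth about {mbit / PER_CONNECTION_MBIT:.1f} parallel connections")
```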

chrisidefix commented on August 14, 2024

I wrote a tiny wrapper using multiprocessing just to test what would happen if acd_cli.py runs in multiple parallel instances. So far it seems to work fine, but I could imagine problems occurring if at some point two data transfers complete at exactly the same time...? I haven't seen this happen yet, but it would be good to test it properly.
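
Roughly something like this (a minimal sketch, not the exact script; the directory names and the upload invocation are placeholders):

```python
# Minimal sketch of such a wrapper: launch one independent acd_cli process
# per directory via a multiprocessing pool.
import multiprocessing
import subprocess

LOCAL_DIRS = ["/data/photos", "/data/videos", "/data/docs"]  # placeholders
REMOTE_DIR = "/backup"                                       # placeholder

def upload(local_dir):
    # Each worker is a separate acd_cli process with its own connection
    # and its own writes to the cache database.
    return subprocess.call(["acd_cli", "upload", local_dir, REMOTE_DIR])

if __name__ == "__main__":
    with multiprocessing.Pool(processes=len(LOCAL_DIRS)) as pool:
        print(pool.map(upload, LOCAL_DIRS))
```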

chrisidefix commented on August 14, 2024

This is turning into a bit of a lonesome conversation 👽 but since I am testing this, I thought I should share my findings.

After good results uploading 2 files in parallel, I realised that the performance will also heavily depend on the disk's read speed. If you are syncing files from an external USB 2.0 hard drive, for example, you probably won't exceed 30 MB/s (depending on your drive), which means you will only want to read 2-3 files in parallel off that disk.

I am currently testing 12 files in parallel, but peak transfer rates are stalling at 30 MB/s, even though the connection would support much more than that. I guess I should try with a faster drive to get better results.
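
Put differently, the useful degree of parallelism is bounded by the slowest link in the chain; a quick sketch with the rough numbers from this thread (approximations, not measurements):

```python
# The slower of (upstream bandwidth, disk read speed) decides how many
# ~10 MB/s connections are worth running.
PER_CONNECTION_MB_S = 10   # observed per-connection cap in MB/s
upstream_mb_s = 125        # e.g. a 1000 Mbit/s line
usb2_read_mb_s = 30        # typical external USB 2.0 drive

bottleneck = min(upstream_mb_s, usb2_read_mb_s)
print(f"Worth running about {bottleneck // PER_CONNECTION_MB_S} parallel uploads")
# -> 3, which matches the 2-3 files suggested above
```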

Also noteworthy - CPU usage is quite reasonable. Every process uses about 5% to 6% of a single CPU core, which leaves you plenty of headroom if you are running on a somewhat modern multi-core CPU.

yadayada commented on August 14, 2024

@chrisidefix If there are two "overlapping" writes to the sqlite database, the instance that tries to write later should crash because it cannot acquire a lock.
There may also be background hashing going on, which lowers the net disk transfer speed; this applies to files larger than 500 MB.
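
That locking behaviour is easy to reproduce in isolation (a toy sqlite3 example with a made-up table, nothing to do with acd_cli's actual schema):

```python
# Two connections to the same database file, like two acd_cli instances.
import sqlite3

conn_a = sqlite3.connect("cache.db", timeout=1)
conn_b = sqlite3.connect("cache.db", timeout=1)

conn_a.execute("CREATE TABLE IF NOT EXISTS nodes (id INTEGER PRIMARY KEY, name TEXT)")
conn_a.commit()

# First writer inserts but does not commit yet, so it keeps the write lock.
conn_a.execute("INSERT INTO nodes (name) VALUES ('first writer')")

try:
    # Second writer cannot acquire the lock and fails once its 1 s timeout expires.
    conn_b.execute("INSERT INTO nodes (name) VALUES ('second writer')")
except sqlite3.OperationalError as exc:
    print(exc)  # -> database is locked
finally:
    conn_a.commit()
```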

However, there also may be unnecessary auth token refreshes.

PS: My maximum upload speed is about 8 Mbit/s, so this isn't very high on my priority list.

chrisidefix commented on August 14, 2024

It may become interesting again when you either:

(1) want to upload many small files
(2) start downloading files

Download speed for many folks could very well be above the limit, but you are right - it's a nice-to-have feature that may be more trouble to implement than its actual benefit justifies.

dansku commented on August 14, 2024

How does the concurrent download work?

yadayada commented on August 14, 2024

There is now an -x argument. E.g. acd_cli dl -x 4 /my_remote_folder for 4 simultaneous connections. Same thing for uploads.

chrisidefix commented on August 14, 2024

@yadayada Thanks for implementing this. This commit gives a significant performance improvement - I tested the feature and can confirm that even at slower upload speeds parallel uploads are well worth using, since they allow much more continuous use of the available bandwidth. The only downside is the elevated CPU usage: previously, when I ran 4 processes in parallel, CPU use was at about 20% (~5% per process); now with -x 4 it is at 80% (~20% per thread) 😞 but at least there should be no issues with DB locks.

All in all, you could consider making -x 2 the default - it should make upload/download faster in any case.

UPDATE: I have been continuously uploading large files (at least 2 GB each) from an external USB drive with acd_cli.py for several hours. Python 3.4 is maxing out one of my cores at 100% by now (running on OS X 10.10.3).

procmail commented on August 14, 2024

I am also finding this a great feature, especially when uploading directories with many small files. With small files, uploading serially won't maximize the bandwidth usage.

However, I'm not seeing a big jump in CPU usage with -x 8. Python's CPU usage is currently 20%-50%, hovering at 20+% most of the time.

yadayada commented on August 14, 2024

I'm keeping one thread as the default for now, because I'm currently not sure if it's safe to insert into sqlite from different threads under all conditions. To my regret, sqlite3 seems to be safe for multiple processes, but not necessarily thread-safe.
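
For what it's worth, Python's sqlite3 module is cautious here too: by default a connection refuses to be used from any thread other than the one that created it, so a threaded writer has to either give each worker its own connection or serialize writes itself. A minimal sketch of the latter (hypothetical helper and table, not acd_cli's actual code):

```python
# One way to keep sqlite writes safe from several worker threads:
# share one connection but serialize access with a lock.
import sqlite3
import threading

_db_lock = threading.Lock()
# check_same_thread=False allows the connection to be used from any thread,
# but then serializing access becomes our own responsibility.
_conn = sqlite3.connect("cache.db", check_same_thread=False)
_conn.execute("CREATE TABLE IF NOT EXISTS nodes (id INTEGER PRIMARY KEY, name TEXT)")
_conn.commit()

def record_upload(node_id, name):
    with _db_lock:  # one writer at a time
        _conn.execute(
            "INSERT OR REPLACE INTO nodes (id, name) VALUES (?, ?)",
            (node_id, name),
        )
        _conn.commit()
```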
