
phockup's People

Contributors

3onyc, amandel, breadcat, daedren, danrue, dependabot[bot], emdioh, francoisvdv, greatquux, inverse, ivandokov, joshuacrewe, matthewgodding, mdujava, moritzfl, neopar, pabera, pauloup, qlyoung, rob-miller, roykrikke, seberm, senden9, solomspd, stchris, tantonescu, trinitonesounds, unapproachable


phockup's Issues

Creating parsed_date object fails for unknown reason. NoneType object is assigned instead.

I experienced the following error twice while organizing a ~100 GB photo library.
Unfortunately, I deleted the images so I cannot provide you with test samples.

It looks like creating the parsed_date object failed for some reason.

pictures/IMG_2226.JPGTraceback (most recent call last):
  File "/usr/local/bin/phockup", line 88, in <module>
    main(sys.argv[1:])
  File "/usr/local/bin/phockup", line 82, in main
    timestamp=timestamp
  File "/usr/local/Cellar/phockup/1.5.6/src/phockup.py", line 36, in __init__
    self.walk_directory()
  File "/usr/local/Cellar/phockup/1.5.6/src/phockup.py", line 67, in walk_directory
    self.process_file(file)
  File "/usr/local/Cellar/phockup/1.5.6/src/phockup.py", line 144, in process_file
    output, target_file_name, target_file_path = self.get_file_name_and_path(file)
  File "/usr/local/Cellar/phockup/1.5.6/src/phockup.py", line 184, in get_file_name_and_path
    date = Date(file).from_exif(exif_data, self.timestamp, self.date_regex)
  File "/usr/local/Cellar/phockup/1.5.6/src/date.py", line 48, in from_exif
    if parsed_date.get("date") is not None:
AttributeError: 'NoneType' object has no attribute 'get'

Apart from that the tool worked like a charm, so thanks for sharing!
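
The traceback shows `.get("date")` being called on a value that turned out to be None. A minimal sketch of a guard follows. `extract_date` is a hypothetical helper standing in for the check at date.py line 48; the real code around it is not shown in this report, so this is only one way the crash could be avoided.

```python
from typing import Optional


def extract_date(parsed_date: Optional[dict]) -> Optional[str]:
    """Return the "date" entry, or None when parsing produced nothing.

    Hypothetical helper: it mirrors the failing check from the traceback,
    but tolerates parsed_date being None instead of raising AttributeError.
    """
    if parsed_date is None:
        return None
    return parsed_date.get("date")
```

With a guard like this, a file whose date cannot be parsed could fall through to the unknown folder instead of stopping the whole run.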

1.5.8 no longer works on Windows

It appears that the file path was having issues and the program was throwing errors such as file not found, etc.

I had to fall back to 1.5.7.

Graphical user interface

Creating a cross platform GUI will be the best thing that can happen for this software and I think this could be the major feature for v2 milestone.

Since I have zero experience with coding GUIs for desktop apps, and especially with Python, any help will be appreciated.

I did some research a while ago on which library to use for a fully cross-platform GUI, but I haven't found an easy-to-use solution. Any suggestions are welcome!

Choose the EXIF date field

It would be nice to have a command-line argument that chooses which EXIF date field is used as the final image date. In the example below, phockup uses the CreateDate field, but the year in this field is 2002. For this file, DateTimeOriginal is the correct field.

exiftool -time:all -mimetype -j IMG_20140513_190138258.jpg


[{
  "SourceFile": "IMG_20140513_190138258.jpg",
  "FileModifyDate": "2014:05:13 19:01:38-03:00",
  "FileAccessDate": "2018:12:21 15:02:32-02:00",
  "FileInodeChangeDate": "2018:05:27 20:36:06-03:00",
  "ModifyDate": "2014:05:13 19:01:38",
  "DateTimeOriginal": "2014:05:13 19:01:38",
  "CreateDate": "2002:12:08 12:00:00",
  "MIMEType": "image/jpeg"
}]
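
A minimal sketch of how a caller-supplied field order could work. `pick_exif_date` and `DEFAULT_FIELDS` are hypothetical names, not part of phockup, and the default order is an assumption; the input is a dictionary shaped like the exiftool JSON above.

```python
from datetime import datetime
from typing import Optional, Sequence

# Assumed default order; a command-line option could override it.
DEFAULT_FIELDS = ("DateTimeOriginal", "CreateDate", "ModifyDate")


def pick_exif_date(exif: dict, fields: Sequence[str] = DEFAULT_FIELDS) -> Optional[datetime]:
    """Return the first parseable date among the requested EXIF fields."""
    for field in fields:
        value = exif.get(field)
        if not value:
            continue
        try:
            # Keep only "YYYY:MM:DD HH:MM:SS" and drop any timezone suffix.
            return datetime.strptime(value[:19], "%Y:%m:%d %H:%M:%S")
        except ValueError:
            continue
    return None


exif = {
    "ModifyDate": "2014:05:13 19:01:38",
    "DateTimeOriginal": "2014:05:13 19:01:38",
    "CreateDate": "2002:12:08 12:00:00",
}
preferred = pick_exif_date(exif)                    # uses DateTimeOriginal
create_only = pick_exif_date(exif, ["CreateDate"])  # reproduces the 2002 date
```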


Fail to handle files with illegal characters

Issue: phockup fails to process files with names like: Photo "2".jpg

Illegal characters, like double quotes, are not escaped in the exiftool call, so files with illegal filenames fail to get exif information and go to the unknown folder, at least on Linux.

Example:

~> phockup input output
/bin/sh: 1: Syntax error: EOF in backquote substitution
input/!#$%&'"*+-.^_`|~:.jpg => output/unknown/!#$%&'"*+-.^_`|~:.jpg
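
A common way to avoid this class of problem is to pass the command as an argument list, so no shell ever parses the file name. The sketch below assumes that approach; `exiftool_argv` and `read_exif_json` are hypothetical helpers, and the exiftool flags are the ones used elsewhere in these reports. If a shell string cannot be avoided, `shlex.quote` escapes the name.

```python
import shlex
import subprocess


def exiftool_argv(path: str) -> list:
    """Build the command as a list; each element reaches exiftool verbatim."""
    return ["exiftool", "-time:all", "-mimetype", "-j", path]


def read_exif_json(path: str) -> str:
    """Run exiftool without a shell (requires exiftool to be installed)."""
    return subprocess.check_output(exiftool_argv(path)).decode("utf-8")


name = "!#$%&'\"*+-.^_`|~:.jpg"
argv = exiftool_argv(name)
# Only needed when a single shell command string is unavoidable.
quoted = shlex.quote(name)
```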

Use Symlinks

Hello.

Can we have an option to choose between a hardlink and a symlink? Right now -l makes a hardlink, which is fine, but it would be better if we could also have, for example, -ls to make symlinks.

P.S.: This is not an issue, but I could not open a new pull request :(, so that's why it is here ;)

Thanks

Allow to move files instead of copy

Currently the entire process is copying files from one location to another.
Adding a flag to move files instead of copying them would be useful when working with a big collection of files where the available space is not enough to hold two copies.

"Original Filenames" option should leave name fully untouched & should not change filenames to lowercase

line 186 of src/phockup.py contains
target_file_name = self.get_file_name(file, date).lower()

When using the -o flag to preserve original filenames, the .lower() function should not be run on the file.

Filenames should be fully untouched when using the -o flag.

Tested and confirmed that removing the .lower() function from this line preserves the original uppercase and lowercase formatting of the filename. However, this should probably only occur when the -o flag is passed, not in all cases as the removal of the .lower() function would do.

Problem using regex

Hello!

I'm having some issues using the option --regex. I'm pretty new to regex and I'm probably the problem here, but I would really appreciate some help.

The regular expression I'm using is:

"img[_-]?(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})[_-]?"

And some of the file names stored in the folder are:

img-20161026-wa0011.jpg
img-20161026-wa0012.jpg
img-20161026-wa0013.jpg
img-20161101-wa0001.jpg

I also tried to use a regex tester like this one and it seems to confirm that my expression is correct.

The command used to run phockup is:
./phockup.py ~/Escritorio/notscanned/ ~/Escritorio/phockup-test/ -m -d YYYY-MM-DD -r="img[_-]?(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})[_-]?"

And with all this phockup only moves the pictures to the unknown folder:

/home/alejandro/Escritorio/notscanned/img-20161101-wa0001.jpg => /home/alejandro/Escritorio/phockup-test/unknown/img-20161101-wa0001.jpg

What am I doing wrong? Any help would be appreciated.

Thanks!
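
A quick check in Python suggests the pattern itself matches these file names. One possible culprit, which is an assumption and not confirmed in this thread, is the `-r=` form: if the arguments are parsed with `getopt`, a short option written as `-r=VALUE` keeps the `=` as the first character of the value, and the pattern then no longer matches.

```python
import getopt
import re

pattern = r"img[_-]?(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})[_-]?"
filename = "img-20161101-wa0001.jpg"

# The pattern on its own finds the date parts.
date_parts = re.search(pattern, filename).groupdict()

# With getopt, "-r=VALUE" yields "=VALUE" as the option value.
opts, _ = getopt.getopt(["-r=" + pattern], "r:")
parsed_value = opts[0][1]
broken_match = re.search(parsed_value, filename)
```

If that is what happens, writing the option as `-r "..."` or `--regex="..."` would be worth trying.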

Don't rely on exiftool

There are Python libraries that can deal with Exif data, like ExifRead.

This could be used instead of the manual invocation of exiftool and associated process handling, and would get rid of the external dependency. Win-win.

Folder-related errors

Hi Ivan,

I get some strange errors related to the input and output folders:

$ phockup . /outputdir
Input directory "." does not exist

Of course it exists, "." is always there. Perhaps the problem is that the current directory name has a space in it? Trying again with the full pathname to the current directory:

$ phockup ~/Dropbox/Camera\ Uploads /outputdir
Input directory "/home/jos/Dropbox/Camera Uploads" does not exist

Another attempt:

$ phockup . /mnt/tower/Media/Foto\'s/
Output directory "/mnt/tower/Media/Foto's/" does not exist, creating now
Traceback (most recent call last):
  File "/snap/phockup/27/lib/phockup/phockup.py", line 263, in <module>
    main(sys.argv[1:])
  File "/snap/phockup/27/lib/phockup/phockup.py", line 28, in main
    os.makedirs(outputdir)
  File "/snap/phockup/27/usr/lib/python3.5/os.py", line 231, in makedirs
    makedirs(head, mode, exist_ok)
  File "/snap/phockup/27/usr/lib/python3.5/os.py", line 231, in makedirs
    makedirs(head, mode, exist_ok)
  File "/snap/phockup/27/usr/lib/python3.5/os.py", line 241, in makedirs
    mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '/mnt/tower'

The output directory is already there, it shouldn't have to be created. Also /mnt/tower is not a read-only file system; it is writable, as I can confirm right after this command.

What can I do?

Jos

copies all files

Newbie question: when I run phockup ./dropbox ./photos, all the files in dropbox are copied,

for example

ls -l photos/unknown
-rw-r--r-- 1 root root  944914432 Oct 11 23:49 '2015 h2.zip'
-rw-rw-r-- 1 root root  519722441 Aug  1 17:13 'All Mail'
-rw-rw-r-- 1 root root 1859857537 Jan 14  2018  d-g.zip
-rwxr-xr-x 1 root root        112 Jan 12  2018  flat.sh
-rw-rw-r-- 1 root root  931395873 Jan 15  2018  h-i.zip
-rw-rw-r-- 1 root root      60614 Aug  1 17:10  inbox
-rw-rw-r-- 1 root root     271360 Aug  1 17:21 'Personal Folders.pst'

How do I make this program process only images and videos?

Traceback: Error in phockup.py?

Hello,

I'm unsure if this issue is related to your program, but when running the snap version of phockup I get this cryptic traceback:

$user@thinkpad ~ $ phockup ~/Dropbox/Kamera-Uploads/ $backupDrive/Pictures/ --date YYYY/MM_M

~/Dropbox/Kamera-Uploads/2017-06-04 11.03.04.jpgTraceback (most recent call last):
  File "/snap/phockup/67/lib/phockup/phockup.py", line 343, in <module>
    main(sys.argv[1:])
  File "/snap/phockup/67/lib/phockup/phockup.py", line 63, in main
    handle_file(os.path.join(root, filename), outputdir, dir_format, move_files)
  File "/snap/phockup/67/lib/phockup/phockup.py", line 235, in handle_file
    if sha256_checksum(source_file) == sha256_checksum(target_file):
  File "/snap/phockup/67/lib/phockup/phockup.py", line 281, in sha256_checksum
    with open(filename, 'rb') as f:
PermissionError: [Errno 13] Permission denied: '/media/julius/Backup/Pictures/2017/06_June/20170604-110304687693.jpg'

Do you have an idea how to solve this issue?
Best

Option -o doesn't preserve uppercase

Using the option --original-filenames, my files still get renamed from "IMG_2018..." to "img_2018..."

To fix it, I had to edit the file src/phockup.py and change line 186 from:

target_file_name = self.get_file_name(file, date).lower()

to:

if self.original_filenames:
    target_file_name = self.get_file_name(file, date)
else:
    target_file_name = self.get_file_name(file, date).lower()

When --move is used, delete empty source directories

When the process strategy is changed to move (-m|--move), the script should check the source directory for any remaining files and, if there are none, delete the directory.

This could be done for each subdirectory after each file move process (could be expensive operation) or at the end of the whole process.
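
A minimal sketch of the end-of-run variant, assuming a hypothetical helper `remove_empty_dirs` that is called once after all files have been moved. It walks bottom-up so that a directory whose only children were empty directories is removed as well, and it leaves the root itself in place.

```python
import os


def remove_empty_dirs(root: str) -> list:
    """Delete empty directories below root, deepest first; return removed paths."""
    removed = []
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # keep the input directory itself
        # Re-check on disk: children may have been removed in earlier iterations.
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(dirpath)
    return removed
```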

Sort by filename too

I had a directory with sorted images and I added some others on top of it, so I ran phockup to get the new ones sorted. While sorting the first ones, the tool moved them to unknown instead of looking at the filename or other attributes.

/home/wilmar/Pictures/sorted/2017/04/20170415-173104.jpg => sorted_images/unknown/20170415-173104.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173242.jpg => sorted_images/unknown/20170415-173242.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173521906057.jpg => sorted_images/2017/04/15/20170415-173521906057.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173525697619.jpg => sorted_images/2017/04/15/20170415-173525697619.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173527947603.jpg => sorted_images/2017/04/15/20170415-173527947603.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173535280797.jpg => sorted_images/2017/04/15/20170415-173535280797.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173538364085.jpg => sorted_images/2017/04/15/20170415-173538364085.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173542739040.jpg => sorted_images/2017/04/15/20170415-173542739040.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173554072463.jpg => sorted_images/2017/04/15/20170415-173554072463.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173556030465.jpg => sorted_images/2017/04/15/20170415-173556030465.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-173849.jpg => sorted_images/unknown/20170415-173849.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-174019666761-2.jpg => sorted_images/2017/04/15/20170415-174019666761.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-174019666761.jpg => sorted_images/2017/04/15/20170415-174019666761-2.jpg
/home/wilmar/Pictures/sorted/2017/04/20170415-203447.mp4 => sorted_images/2017/04/15/20170415-203447.mp4

Add option for custom regex for date in filename

Some cameras do not write correct EXIF data but do use filenames that contain the date (and time). There is guessing code for such filenames (IMG_20160915_123456.jpg / IMG-20160915-123456.jpg), but it covers only one generic pattern and any other filename format is ignored. An option to pass a regex for date and time guessing would be useful.

Add automated tests

Currently the code does not have any kind of tests but it should.

I've created a separate repository for these tests because they will have some large dummy files and we do not want to include them in the final software.

XMP file

Would it be possible to add support for XMP files (generated, for example, by darktable)?

Thanks !

write stdout to log file

Write STDOUT to log file.

This would be very useful if a human mistake happens during a file copy or move operation.

Skipping files without EXIF date

Is it possible to skip files that have no date data and would otherwise land in the unknown folder?

I have many files that would no longer be identifiable once they land in an unknown directory, and the context of the original path would help me decide which occasion and date to use for a fix.

When target file exists it is overwritten

If the target file exists it should not overwrite it right away.
The sha256 checksums should be compared and, if they match, the file should be skipped. Otherwise a new file name with a suffix should be chosen. The name of the XMP file should also change, if one exists.
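
A minimal sketch of that rule. `resolve_target` is a hypothetical helper; the `-2`, `-3`, ... suffix scheme is an assumption modelled on file names that appear in other reports here, and the XMP sidecar handling is left out.

```python
import hashlib
import os
from typing import Optional


def sha256_checksum(path: str, block_size: int = 65536) -> str:
    """Hash a file in blocks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def resolve_target(source: str, target: str) -> Optional[str]:
    """Return a free target path, or None if an identical file already exists."""
    base, ext = os.path.splitext(target)
    candidate = target
    suffix = 1
    while os.path.exists(candidate):
        if sha256_checksum(source) == sha256_checksum(candidate):
            return None  # same content: skip instead of overwriting
        suffix += 1
        candidate = f"{base}-{suffix}{ext}"
    return candidate
```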

Classify by year/month only

It would be nice if we could choose to classify the pictures by year/month only.
For me, having all the photos of the same month in one directory is sufficient.

NFS mounted directory, path check fails.

Hello.

It appears that the application can't seem to work on NFS mounted shares. I've tried to take a look into the code and replicating the issue just running within python itself (the path check) but it works there fine, which is odd.
I've also tried for kicks to change the permissions to the deadly 777. This also didn't seem to affect how it works.
Here are some pictures of what's been tried, and the original command.
[three screenshots]
The mount is as follows:
server.domain.com:/mnt/JoNas_Vol_1/ on /JoNAS type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=ip.ad.dr.ess,mountver=3,mountpor=990,mountproto=udp,locl_locl=none,addr=ip.ad.dre.ss)

I did also try using . as the initial path, and it seemed to work which was also odd, but I wasn't able to get the secondary path to work.

If you require any more information, let me know.

Thanks!

When exiftool returns error the process is stopped

Pictures/DCIM/136___05/IMG_2395.JPG~RF8ea1f.TMPTraceback (most recent call last):
  File "/usr/local/bin/phockup", line 243, in <module>
    main(sys.argv[1:])
  File "/usr/local/bin/phockup", line 51, in main
    handle_file(file, outputdir)
  File "/usr/local/bin/phockup", line 178, in handle_file
    exif_data = exif(file)
  File "/usr/local/bin/phockup", line 66, in exif
    data = check_output(['exiftool', file]).decode('UTF-8').strip().split("\\n")[0].split("\n")
  File "/usr/lib/python3.5/subprocess.py", line 626, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.5/subprocess.py", line 708, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['exiftool', 'Pictures/DCIM/136___05/IMG_2395.JPG~RF8ea1f.TMP']' returned non-zero exit status 1
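
One way to keep the run going is to catch the error per file and treat that file as having no EXIF data. A minimal sketch, assuming a hypothetical wrapper `run_tool`; the caller would pass `['exiftool', file]` as in the traceback above and send files that return None to the unknown folder.

```python
import subprocess
import sys
from typing import Optional


def run_tool(argv: list) -> Optional[str]:
    """Return the command's output, or None if it exits with a non-zero status."""
    try:
        return subprocess.check_output(argv, stderr=subprocess.DEVNULL).decode("utf-8")
    except subprocess.CalledProcessError:
        return None


# Demonstration with the Python interpreter standing in for exiftool.
failed = run_tool([sys.executable, "-c", "import sys; sys.exit(1)"])
succeeded = run_tool([sys.executable, "-c", "print('ok')"])
```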

Progress bar

It will be awesome to have a progress bar at the bottom of the process.
With the current way of printing the process data it could be a tricky one.

KeyError: 'MIMEType'

Hi, got this error with phockup v1.5.3 while running the following command, don't know what's the issue:

$ ./phockup.py ~/data.photos/PhotoDVD23 ~/data.photos/PhotoDVD1_new -m -d YYYY.MM

/home/superuser/data.photos/PhotoDVD23/DSC_5469.JPG => /home/superuser/data.photos/PhotoDVD1_new/2011.07/20110709-15204290.jpg
/home/superuser/data.photos/PhotoDVD23/DSC_5470.JPG => /home/superuser/data.photos/PhotoDVD1_new/2011.07/20110709-15210110.jpg
/home/superuser/data.photos/PhotoDVD23/DSC_5471.NDFTraceback (most recent call last):
  File "./phockup.py", line 75, in <module>
    main(sys.argv[1:])
  File "./phockup.py", line 69, in main
    date_regex=date_regex
  File "/home/superuser/data.git/phockup.git/src/phockup.py", line 34, in __init__
    self.walk_directory()
  File "/home/superuser/data.git/phockup.git/src/phockup.py", line 65, in walk_directory
    self.process_file(file)
  File "/home/superuser/data.git/phockup.git/src/phockup.py", line 137, in process_file
    output, target_file_name, target_file_path = self.get_file_name_and_path(file)
  File "/home/superuser/data.git/phockup.git/src/phockup.py", line 176, in get_file_name_and_path
    if exif_data and self.is_image_or_video(exif_data['MIMEType']):
KeyError: 'MIMEType'
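
The failing line indexes `exif_data['MIMEType']` directly, so any file for which exiftool reports no MIME type raises KeyError. A minimal sketch of a tolerant check using `dict.get`; both functions are hypothetical stand-ins written for illustration and not copied from phockup.

```python
from typing import Optional


def is_image_or_video(mimetype: Optional[str]) -> bool:
    """True for MIME types such as image/jpeg or video/mp4."""
    return bool(mimetype) and mimetype.split("/")[0] in ("image", "video")


def should_sort(exif_data: Optional[dict]) -> bool:
    """Missing EXIF data or a missing MIMEType key both mean 'do not sort'."""
    return bool(exif_data) and is_image_or_video(exif_data.get("MIMEType"))
```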

Ubuntu 16.04: Permission denied?

I installed PhockUp in Ubuntu 16.04 using sudo snap install phockup. Then I ran it like this:

$ phockup ownCloud/InstantUpload/ InstantUploadSorted/
/snap/phockup/25/command-phockup.wrapper: 6: exec: /snap/phockup/25/phockup.sh: Permission denied

Is it intentional? Do I have to run it with root permission? (I haven't tested if it works with root permission for obvious reasons.) I find it a bit surprising if arranging and copying files in my home directory requires root access. Could you please give a hint, or is this a bug?

Refactor to Python class

The code uses plain functions, and a few of them take a lot of arguments just to pass global settings down to the function that actually needs them. One such example is:

def handle_file(source_file, outputdir, dir_format, move_files, date_regex=None):

By refactoring to a class those global arguments could be class properties set by the constructor.

Be able to change the file name format.

By default phockup renames the files completely using a date-based pattern. We can also keep the original name, but there is nothing in between. It might be useful to keep the original part of the file name and add the date as a prefix. It might also be useful to change the date format, e.g. to avoid the long "timestamp".

Regex on filenames with optional hour info

A bit of an extension of #55

If we have a regex that accepts hour information, yet has it as optional, phockup will crash.

Assuming the following custom regex:
(?P<day>\d{2})\.(?P<month>\d{2})\.(?P<year>\d{4})[_-]?((?P<hour>\d{2})\.(?P<minute>\d{2})\.(?P<second>\d{2}))?
and the following filename: IMG_27.01.2015.jpg

In date.py
match_dir = matches.groupdict()
will still create keys for hour information, but their values will be None.

match_dir = dict([a, int(x)] for a, x in match_dir.items())
will consequently crash when casting NoneType to int
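
A minimal sketch of one possible fix: drop the groups that did not participate in the match before converting to int. The variable names follow the snippet above; how the missing time parts are filled in afterwards is left open.

```python
import re

pattern = (
    r"(?P<day>\d{2})\.(?P<month>\d{2})\.(?P<year>\d{4})[_-]?"
    r"((?P<hour>\d{2})\.(?P<minute>\d{2})\.(?P<second>\d{2}))?"
)
matches = re.search(pattern, "IMG_27.01.2015.jpg")

# groupdict() returns None for optional groups that did not match; skip those.
match_dir = {key: int(value) for key, value in matches.groupdict().items() if value is not None}
```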

Dry run option

I have a feature request, but I'd be willing to contribute an implementation if you think this is a good idea: a --dry-run option, which would just print out the log output, but actually not move any files or change any content on disk.

How would you feel about this?

EDITED for clarity: not move any files
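
A minimal sketch of how such a flag could gate the file operations. `transfer` is a hypothetical function, not phockup's actual code: it always prints the usual `source => target` line and only touches the disk when `dry_run` is false.

```python
import os
import shutil


def transfer(source: str, target: str, move: bool = False, dry_run: bool = False) -> bool:
    """Log the operation; copy or move only when dry_run is False."""
    print(f"{source} => {target}")
    if dry_run:
        return False  # nothing was written
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    if move:
        shutil.move(source, target)
    else:
        shutil.copy2(source, target)
    return True
```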

Allow user to add to ignore_files list

Allow the user to append filenames to the ignore_files list.

ignored_files = (".DS_Store", "Thumbs.db")

Should allow the use of a flag, such as -i, to accept both:

  1. Multiple filenames within the command line (as in -i filename1.txt file2.docx or -i filename1.txt -i file2.docx)
  2. Text file with exclusion list on multiple lines (as in -i /path/to/phockupIgnoreList.txt with contents of phockupIgnoreList.txt being:

    filename1.txt
    file2.docx

(The flag may need to differ for these two, such as using -i for in-line filename exclusions and -n for passing an ignore file).

This ignore feature should allow extension-level exclusions, such as *.txt as well as folder level exclusions (particularly for hidden folders), such as ./.hidden/
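
A minimal sketch of the matching side, using shell-style patterns from `fnmatch`. `load_ignore_file` and `is_ignored` are hypothetical names, and the rule that a pattern is tested against every path component (so `.hidden` excludes everything under that folder) is an assumption about the desired behaviour.

```python
import fnmatch
import os

DEFAULT_IGNORED = (".DS_Store", "Thumbs.db")


def load_ignore_file(path: str) -> list:
    """Read one pattern per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def is_ignored(path: str, patterns=DEFAULT_IGNORED) -> bool:
    """True if any path component (folder or file name) matches any pattern."""
    parts = os.path.normpath(path).split(os.sep)
    return any(fnmatch.fnmatch(part, pattern) for part in parts for pattern in patterns)
```

Patterns given with -i on the command line and patterns read from a file could then be merged into one list before the directory walk starts.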

Multithreading support

I just started using phockup. I tried to process a folder of 25k images and it took about 1h to complete, with not much processing power being used. So I wondered how much better it could be with multithreading support.

After some coding, I managed to do that. Here is the time output with a test folder of 36 images and 4 threads:

Original:

~> time phockup test test2 --move 
6.43user 0.52system 0:06.77elapsed 102%CPU (0avgtext+0avgdata 17544maxresident)k

Multithread:

~> time phockup test test2 --move --threads 4
9.28user 0.69system 0:03.23elapsed 309%CPU (0avgtext+0avgdata 17596maxresident)k

Results: Elapsed time from 7s to 3s and CPU usage of 3x.

I had never played with threads in Python before this, so I'm sure my code is not the best way to do it. I'm just making the point that multithreading can improve the performance of phockup, since it relies heavily on exiftool, which ends up being a performance bottleneck.

What I did: In the walk_directory function I split the files list into subsets, and each subset goes into a thread. The --threads (or -x) parameter defines how many threads are going to be used. I also adjusted the print commands to prevent a race condition on stdout. The patches with the code changes I did are attached.

phockup.py
phockup.py-multithread-patch.txt

src/help.py
src-help.py-multithread-patch.txt

src/phockup.py
src-phockup.py-multithread-patch.txt
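
For comparison, a minimal sketch of the same idea using a thread pool from the standard library instead of manually split subsets. This is a swapped-in technique and not the attached patch; `process_all` and its parameters are hypothetical. Threads can help here because most of the time is spent waiting on the external exiftool process.

```python
from concurrent.futures import ThreadPoolExecutor


def process_all(files, process_file, threads=4):
    """Run process_file over files on a pool of worker threads.

    Results come back in input order, so the caller can print them
    sequentially and avoid interleaved output on stdout.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(process_file, files))
```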

Thank you for creating and sharing phockup.

Remove list of file extension and check all files with exiftool

At the moment the code is looking for NEF and JPG files.
It can try to read all files' exif data and act accordingly.
The matched photos will be sorted, the matched videos will be also sorted, unmatched files will be copied to unknown directory.

exiftool returns error for unknown type:
Error: Unknown file type

New software name

The name of the software comes from combining the words "photos" and "backup", as this was the main idea behind Phockup, but when you pronounce it you get quite mixed feelings about what it does. Initially the software was made for my own usage and I thought a funny name wouldn't do any harm, but since we got an article in OMG Ubuntu I think we need a new name. Something that is not age restricted :)

Please give your suggestion for a new name of the software.

PS: I am planning to create a GUI for easier usage, and I think the rename will be done when the GUI is completed (version 2).

ModuleNotFoundError: No module named 'src'

I installed the package on an Arch system via the yaourt command.
I tried to launch the phockup command in a bash terminal and got:

ModuleNotFoundError: No module named 'src'

It seems that the python module does not exist:

ls -al /usr/share/phockup/
total 24
drwxr-xr-x   2 root root  4096 30 déc 11:11 .
drwxr-xr-x 262 root root 12288 30 déc 10:59 ..
-rwxr-xr-x   1 root root  2241 30 déc 11:10 phockup.py

How do I install the package correctly on Arch?
Thanks for your help !
