lovasoa / dezoomify

Dezoomify is a web application to download zoomable images from museum websites, image galleries, and map viewers. Many different zoomable image technologies are supported.

Home Page: https://dezoomify.ophir.dev

License: GNU General Public License v2.0

JavaScript 84.76% PHP 1.74% CSS 3.51% HTML 10.00%
zoomable-images zoomify downloader image hack museum dezoomify iiif google-art openseadragon

dezoomify's Introduction

Dezoomify

Dezoomify cover image

Download zoomable images

Dezoomify extracts full high-resolution images from online zoomable image interfaces. It works with several zoomable image tools, on several different websites (see the list below). It takes the URL of a zoomable image as input and outputs an image that you can download (by right-clicking on it and choosing Save Image as...).

To find the URL of the zoomable image that dezoomify requires, you can install the dezoomify browser extension. Alternatively, you can try to find the zoomable image URL yourself.

Try it

If you are not interested in the source code and just want to assemble the tiles of (dezoomify) a Zoomify-powered image, go here: unzoomify an image

Troubleshooting

FAQ

If you have problems while downloading an image, then read the FAQ.

Reporting issues

Your bug reports and feature requests are welcome! Please go to the GitHub issue page of the project and explain your problem. Please be clear, and give the URL of the page containing the image dezoomify failed to process.

Supported zoomable image formats

The following formats are supported by dezoomify:

The most prominent supported websites include:

  • Arts & Culture (artsandculture.google.com)
  • Gallica (gallica.bnf.fr)
  • The British Library (bl.uk)
  • National Gallery of Art (nga.gov)
  • Hungaricana (hungaricana.hu)
  • National Library of Australia (nla.gov.au)
  • National Library of Israel (nli.org.il)
  • National Galleries Of Scotland (nationalgalleries.org)
  • National Library of Scotland (nls.uk)
  • Harvard Library (library.harvard.edu)
  • heidICON, Heidelberg University (heidicon.ub.uni-heidelberg.de)
  • Geographicus (geographicus.com)
  • Archivio di Stato di Trieste (archiviodistatotrieste.it)

Dezoomify also has a generic dezoomer. If the zoomable image format is simple enough, you just have to enter a pattern of tile URL, and dezoomify will be able to work with it.
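To illustrate what a tile URL pattern amounts to, here is a minimal sketch that expands a template into the list of tile URLs for a grid. The `{{X}}` and `{{Y}}` placeholder syntax, the function name `expandTilePattern`, and the example.org address are assumptions made for this example; check the generic dezoomer itself for the exact syntax it accepts.

```javascript
// Hypothetical template syntax: {{X}} is the tile column, {{Y}} the tile row.
function expandTilePattern(pattern, columns, rows) {
  const urls = [];
  // Walk the grid row by row, left to right.
  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < columns; x++) {
      urls.push(
        pattern.replace(/\{\{X\}\}/g, String(x)).replace(/\{\{Y\}\}/g, String(y))
      );
    }
  }
  return urls;
}

// Example: a 3 x 2 grid of tiles on a made-up server.
const tileUrls = expandTilePattern("https://example.org/tiles/{{X}}_{{Y}}.jpg", 3, 2);
console.log(tileUrls.join("\n"));
```

A format is "simple enough" in this sense when every tile address can be derived from its column and row alone.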

Screenshots

dezoomify downloading an image

Video tutorial

Video tutorial for dezoomify

Programming Languages

The aim of the script is to do as much as possible in JavaScript (with the HTML5 <canvas> tag), and to handle only the network-related parts on the server side. The only piece of server-side code that remains is a proxy, used to circumvent the same-origin policy. This proxy is implemented both in JavaScript (node-app/proxy.js) and PHP (proxy.php), so you only need one of them on your server to run dezoomify.
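As a rough illustration of the client-side part, the sketch below computes where each tile of a regular grid would be drawn on a canvas. The function name `tileLayout` and the sample dimensions are assumptions for this example, not dezoomify's actual code. In a browser, each loaded tile image would then be painted with `ctx.drawImage(img, tile.x, tile.y)`; tiles fetched from another origin without CORS approval taint the canvas and block export, which is one reason a proxy is needed.

```javascript
// Compute the position and size of every tile in a regular grid.
// Tiles in the last column and row may be smaller than tileSize.
function tileLayout(imageWidth, imageHeight, tileSize) {
  const cols = Math.ceil(imageWidth / tileSize);
  const rows = Math.ceil(imageHeight / tileSize);
  const tiles = [];
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      const x = col * tileSize;
      const y = row * tileSize;
      tiles.push({
        col,
        row,
        x, // left edge of the tile on the canvas
        y, // top edge of the tile on the canvas
        width: Math.min(tileSize, imageWidth - x),
        height: Math.min(tileSize, imageHeight - y),
      });
    }
  }
  return tiles;
}

// Example with made-up dimensions: a 600 x 300 image cut into 256-pixel tiles.
const layout = tileLayout(600, 300, 256);
console.log(layout.length, layout[layout.length - 1]);
```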

Wikimedia

This script on Wikimedia: Zoomify in the help about zoomable images on Wikimedia

Local development

You can run the script locally, using php:

# Install the dependencies
sudo apt install php-cli

# Run the script
php -S localhost:3000

Then open http://localhost:3000/ in your browser.

GPL

Copyright © 2011-2017 Lovasoa

This file is part of Dezoomify.

Dezoomify is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

Dezoomify is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with Dezoomify; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

dezoomify's People

Contributors

1-byte, adunning, agmmnn, avindra, bangank36, dependabot[bot], haroenv, lovasoa, noah-dollar, vdk

dezoomify's Issues

Add Support for Issuu

Hello Lovasoa
Hope you're doing fine
of course first of all thanks for your valuable help!
I'd like to download pages or books from the website Issuu, but with ordinary downloaders I only get large-thumbnail quality.
Can dezoomify help?
Giving you an example of a magazine I'd like to download

http://issuu.com/baranes/docs/les_belles_heures_du_duc_de_berry__

Thanks for any help you might give

See you
HoremHeb

Support for "Biblioteka zabytków polskiego piśmiennictwa średniowiecznego"

This is actually a CD: https://sklep.ijp-pan.krakow.pl/images/zabytki.gif.

It never worked for me on Linux, although the content is just HTML with Zoomify; now it doesn't work even on Windows. I would be happy to extract the images from the CD. My earlier attempts to use some dezoomify software failed for various reasons.

Unfortunately I cannot provide you with the complete CD as some parts of it are copyrighted (you can still buy it for 9,99 PLN = 2,5 Euro at https://sklep.ijp-pan.krakow.pl/product_info.php?products_id=30).

Here is a listing of a directory containing a document:

dr-x------ 3 jsbien jsbien   4096 Aug 13  2006 001r
dr-x------ 3 jsbien jsbien   4096 Aug 13  2006 001v
dr-x------ 3 jsbien jsbien   4096 Aug 13  2006 002r
dr-x------ 3 jsbien jsbien   4096 Aug 13  2006 002v
dr-x------ 3 jsbien jsbien   4096 Aug 13  2006 003r
dr-x------ 3 jsbien jsbien   4096 Aug 13  2006 003v
dr-x------ 3 jsbien jsbien   4096 Aug 13  2006 004r
dr-x------ 3 jsbien jsbien   4096 Aug 13  2006 004v
-r-------- 1 jsbien jsbien    781 Jun 18  2006 ar.htm
-r-------- 1 jsbien jsbien    779 Jun 18  2006 av.htm
-r-------- 1 jsbien jsbien   1043 Jun 18  2006 bottom.html
-r-------- 1 jsbien jsbien    781 Jun 18  2006 br.htm
-r-------- 1 jsbien jsbien    829 Jun 18  2006 btm_2.gif
-r-------- 1 jsbien jsbien   4792 Jan 29  2007 btm_2.html
-r-------- 1 jsbien jsbien    779 Jun 18  2006 bv.htm
-r-------- 1 jsbien jsbien    781 Jun 18  2006 cr.htm
-r-------- 1 jsbien jsbien    779 Jun 18  2006 cv.htm
-r-------- 1 jsbien jsbien    781 Jun 18  2006 dr.htm
-r-------- 1 jsbien jsbien    779 Jun 18  2006 dv.htm
-r-------- 1 jsbien jsbien    848 Jan 29  2007 index2.html
-r-------- 1 jsbien jsbien    720 Nov 17  2006 index3.html
-r-------- 1 jsbien jsbien    848 Jan 29  2007 index.html
-r-------- 1 jsbien jsbien 339758 Nov  8  2006 KazSKSkr.pdf
-r-------- 1 jsbien jsbien 929341 Jan 15  2007 KazSKTL.pdf
-r-------- 1 jsbien jsbien 614020 Jan 15  2007 KazSKTS.pdf
-r-------- 1 jsbien jsbien   2231 Apr 26  2006 _l.jpg
-r-------- 1 jsbien jsbien    657 Apr 26  2006 _m2.jpg
-r-------- 1 jsbien jsbien   1893 Apr 26  2006 main_2.html
-r-------- 1 jsbien jsbien    676 Apr 26  2006 _m.jpg
-r-------- 1 jsbien jsbien   1240 Jun 18  2006 panel2.gif
-r-------- 1 jsbien jsbien   1241 Jun 18  2006 panel.gif
-r-------- 1 jsbien jsbien   2329 Aug 10  2006 _p.jpg
-r-------- 1 jsbien jsbien   6144 Nov 21  2006 Thumbs.db
-r-------- 1 jsbien jsbien   1502 Jan 22  2007 top.html
-r-------- 1 jsbien jsbien  23302 Nov 20  2003 zoomifyViewer.swf

and this is a sample page (ar.htm):

<HTML><style type="text/css">
<!--
body {
    margin-left: 0px;
    margin-top: 0px;
    margin-right: 0px;
    margin-bottom: 0px;
}
-->
</style><BODY><DIV ALIGN="center">
<OBJECT CLASSID="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" CODEBASE="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" WIDTH="100%" HEIGHT="100%">
                <PARAM NAME="FlashVars" VALUE="zoomifyImagePath=001r/">
                <PARAM NAME="zoomifyZoom" VALUE="200">
                <PARAM NAME="src" VALUE="zoomifyViewer.swf">


                <EMBED  SRC="zoomifyViewer.swf" PLUGINSPAGE="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash"  WIDTH="100%" HEIGHT="100%"></EMBED>
              </OBJECT>
  </DIV>
</BODY></HTML>

I would appreciate your advice on how to proceed with the extraction task.

Best regards

Janusz

P.S. Thank you very much for Polona support!
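For readers facing a similar local Zoomify folder, here is a minimal sketch of how tile paths are commonly derived in the folder-based Zoomify layout: an ImageProperties.xml file usually gives the width, height, and tile size, and tiles are stored as `TileGroup<N>/<zoom>-<column>-<row>.jpg`, with 256 tiles per group counted from the most zoomed-out level. This layout is an assumption about the CD, not something confirmed by the listing above; the function names and the 5000 x 5000 dimensions are invented for illustration.

```javascript
// List the tile grid of every zoom level, from most zoomed out (index 0)
// to full resolution, assuming each level halves the previous one.
function zoomifyTiers(width, height, tileSize) {
  const tiers = [];
  let span = tileSize; // image pixels covered by one tile at this tier
  while (width > span || height > span) {
    tiers.push({ cols: Math.ceil(width / span), rows: Math.ceil(height / span) });
    span *= 2;
  }
  tiers.push({ cols: 1, rows: 1 }); // the whole image fits in a single tile
  return tiers.reverse();
}

// Build the relative path of one tile under the assumed layout.
function zoomifyTileUrl(base, width, height, tileSize, zoom, col, row) {
  const tiers = zoomifyTiers(width, height, tileSize);
  let tilesBefore = 0; // tiles in all more zoomed-out tiers
  for (let z = 0; z < zoom; z++) {
    tilesBefore += tiers[z].cols * tiers[z].rows;
  }
  const index = tilesBefore + row * tiers[zoom].cols + col;
  const group = Math.floor(index / 256); // assumed: 256 tiles per TileGroup folder
  return `${base}TileGroup${group}/${zoom}-${col}-${row}.jpg`;
}

// Example with made-up dimensions for the folder named in the FlashVars above.
console.log(zoomifyTileUrl("001r/", 5000, 5000, 256, 5, 0, 6));
```

Some Zoomify exports round the level sizes differently, so the computed group numbers should be checked against the folders that actually exist on disk.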

First version of your link http://jk.g6.cz/dezoomify.html

Hello Lovasoa
I always have the same issue with this old link of yours. When I right-click and select Save Image, Mozilla opens a "save page" dialog which is blank. It does this most of the time, and only sometimes does the window open correctly so that I can save the page. Is there any solution to this issue?
I am downloading from this manuscript: http://www1.arkhenum.fr/images/bm_verdun_ms/ms/OEB/MS0107/images/B555456201_MS0107_042r
Your old link worked for that, but now I almost always get a blank page when I try to save in Mozilla.
I didn't want to bother you with another issue, since this link seemed to work just fine. On the new dezoomify page this manuscript doesn't seem to be downloadable, so I was using the old link, but I began getting this issue... :-|
Thanks!
Yours sincerely
HoremHeb

Dezoomify images on Masterfile.com?

Hi, thanks for all your hard work on this very useful tool. I have been using Firefox and a "save cached images" plugin to download individual tiles from zoomable images for years, and stitching them together in Photoshop to make wallpapers for my computer. Dezoomify looks like it might be a massive time-saver; however, I can't find the XML file in the network traffic as your FAQ recommends, and none of the other URLs I've tried in vain hope have worked either. It seems to use the SeaDragon PFF format, but they look to have done some extra locking down because it's just not working :(

Would you mind taking a look and giving me an idea of whether it'd be possible to get Dezoomify working, for example on any image from the search at this URL? http://www.masterfile.com/em/search/#session=1464166286938&id=1464166286652&color=&colour_key=0&format=hvsp&imgtype=P&releases=&keyImage=&keyword=jungle+waterfall&license=RM&mode=search&sort=hatter

Thanks so much!

Add support for bspe-p-pub.paris.fr

Site description

Portail des bibliothèques spécialisées de la ville de Paris

Zoomifier type

zoomify

URL example exposing the problem

http://bspe-p-pub.paris.fr/MDBGED/zoomify-BFS.aspx?edid=8727&edfindex=0&count=1&referer=http%3a%2f%2fbspe-p-pub.paris.fr%2fMDBGED%2fEDFileDetail-BFS.aspx%3fedid%3d8727&hash=2ead02da52ccdbb882ec136edba4b72b

Problem description

The server raises an error when loading a page without the ASP.NET_SessionId cookie set.
See #27

Proposed solution

This will require several steps:

  • Modify the PHP proxy to allow setting custom headers.
  • Add a way to retrieve headers from a URL.
  • Load pages from this site in two steps:

Artsy

I can't download images from Artsy using http://ophir.lojkine.free.fr/dezoomify/dezoomify.html. For instance this one

saving dzi file password protected

How can I grab an image or DZI file that's password protected? I saw a brief explanation, but it does not make any sense. I am running dezoomify.py with Python 3 on Windows; the web version obviously doesn't work on protected URLs.

diomedia.com : Invalid XML error

http://www.diomedia.com/imageDetails.do?imageId=14321102 Is the page I'm trying to rip the full-size image from. I fed that URL into Dezoomify, but now I'm getting this error: Error: Invalid XML: http://www.diomedia.com/,window.XMLHttpRequest&&XMLHttpRequest.prototype&&XMLHttpRequest.prototype.addEventListener&&window.XML (http://ophir.alwaysdata.net/dezoomify/zoommanager.js:107)

I'm not that code-savvy, so apologies if I'm just dumb; I tried looking into the source code but couldn't find the right .xml file, I guess.

Request support for National Gallery images

Hi, Lovasoa:
If you still remember, I received your kind help some months ago.
Now I have a new issue that I need your support with.
Today I noticed that the National Gallery images do not work for me; the error is Unable to find a proper dezoomer for the given URL. (http://ophir.alwaysdata.net/dezoomify/zoommanager.js:170). Also, when I open the National Gallery full image viewer (full-screen view), the popup window no longer has the address bar (URL) that it used to have! How can I find the exact URL that contains the zoomable image for the National Gallery? I tried to find it with Inspect Element, but failed.

Thanks a lot

Regression with polona.pl

Hello,
I'm trying to extract an image from the zoom-based viewer of an online library. Previously I managed to extract over a hundred images; however, I now experience an issue, as none of the dezoomify options works. I end up with this error:

Error !
Uncaught TypeError: Cannot read property 'documentElement' of null (http://ophir.alwaysdata.net/dezoomify/zoommanager.js:199)

The link I use to extract image is: http://polona.pl/item/5904599/0/
Previously I managed to simply copy the url to dezoomify and it was building up tiles allowing me to save a canvas.png file. If you could have a look at the script in case anything has changed, that would be highly appreciated!

Thank you!
Maria

Add support for zoom.it

I'm attempting to download this beautiful image displaying language and color I saw on this site:

http://fathom.info/colorful-language

Clearly you can see there's nothing there. So I went into the internet time machine and it works for some dates such as this:

https://web.archive.org/web/20150908100823/http://fathom.info/colorful-language

I've tried putting both sites into Dezoomify but it keeps returning:

Uncaught TypeError: Cannot read property 'documentElement' of null (http://ophir.alwaysdata.net/dezoomify/zoommanager.js:239)

Is this a hopeless situation or is there a way to get that image?

National Library of Wales

Hi,

I've used your program in the past and I think that it is really great. I've tried to download an image from the National Library of Wales website but everything that I've tried before doesn't work. Am I being stupid or is there an issue with the file that is unsupported? The file is found here in Firebug console: value="zoomifyImagePath=delweddau/jws/jws00042.pff&zoomifyServerIP=www.llgc.org.uk&zoomifyServerPort=80&zoomifyTileHandlerPath=/zoomify/ZoomifyServlet&zoomifyInitialRotation=0"
Many thanks for your help, Emyr
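The FlashVars value quoted above is an ordinary query string, so its parts can be separated with the standard `URLSearchParams` class before deciding which URL to try. The sketch below only parses the string; the helper name `parseFlashVars` is made up for this example, and it does not show how the Zoomify servlet should be queried for a PFF file.

```javascript
// Split a FlashVars string into a plain object of key/value pairs.
function parseFlashVars(flashVars) {
  return Object.fromEntries(new URLSearchParams(flashVars));
}

// The value reported in this issue.
const vars = parseFlashVars(
  "zoomifyImagePath=delweddau/jws/jws00042.pff" +
    "&zoomifyServerIP=www.llgc.org.uk" +
    "&zoomifyServerPort=80" +
    "&zoomifyTileHandlerPath=/zoomify/ZoomifyServlet" +
    "&zoomifyInitialRotation=0"
);
console.log(vars.zoomifyImagePath, vars.zoomifyServerIP, vars.zoomifyTileHandlerPath);
```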

add unit tests

Unit tests would prevent regression, and help find websites that change their APIs.

Can't get images from PUDL

Hi,
First of all, Dezoomify is great! This is the first time I have used it and it really surprised me! I tested some zoomable files and it worked great! However, I have run into a problem.
I want to download some images from this website but it always fails with an error: http://pudl.princeton.edu/sheetreader.php?obj=qv33rw772
I found the .xml file but it still doesn't work.
I hope you can tell me how to solve it.
Thank you.

Add support for polona.pl

I have tried your method for finding a working link for dezoomify, but it's always rather hard. I am having problems with Polona. I tried to read the source code, and I tried to compose the web link in different ways in the dezoomify input field on your page, but to no avail. This is the page I tried, with no result. LOL
http://polona.pl/item/285246/26/

Initial message by @Horemheb on #5

Select the right dezoomer automatically

Sometimes, the dezoomer to use can be guessed from the URL. Sometimes, it can be guessed from the contents of the page.

We should be able to select the right dezoomer automatically in these cases.
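One possible shape for URL-based selection is a small registry in which each dezoomer declares a test on the URL and the first match wins. The sketch below is only an illustration of that idea: the registry, the names, and the patterns (Zoomify's ImageProperties.xml, Deep Zoom's .dzi extension, IIIF's info.json) are assumptions for this example and not dezoomify's actual implementation.

```javascript
// Hypothetical registry: each entry knows how to recognize "its" URLs.
const dezoomers = [
  { name: "zoomify", matches: (url) => /ImageProperties\.xml(\?|$)/i.test(url) },
  { name: "deepzoom", matches: (url) => /\.dzi(\?|$)/i.test(url) },
  { name: "iiif", matches: (url) => /\/info\.json(\?|$)/i.test(url) },
];

// Return the name of the first dezoomer whose test accepts the URL,
// or null when the URL alone is not enough to decide.
function guessDezoomer(url) {
  const found = dezoomers.find((d) => d.matches(url));
  return found ? found.name : null;
}

console.log(guessDezoomer("https://example.org/images/map/ImageProperties.xml"));
console.log(guessDezoomer("https://example.org/viewer.html"));
```

When the URL gives no hint, the same registry could be extended with a second test that inspects the page contents, which is the other case mentioned above.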

Archieven.nl doesn't work

Add support for ZIF files?

Some sites use Zoomify images where the tiles are packaged in a single ZIF file instead of existing separately. Any chance of adding support for those?

Update

ZIF file support hasn't been added to dezoomify yet, but two tools have been written that can extract an image from a ZIF file:

McCord Museum

Hi there,

First of all, let me thank you a lot for the work you've been doing. It's wonderful.
I think I also need a little help here. I've been trying for hours to save images from the McCord Museum (Montreal) website and I simply can't work out how to do it. I've tried to find the image path but couldn't.

Let's take this image for example: http://collection.mccord.mcgill.ca/fr/collection/artefacts/MP-0000.236.1

I've tried to extract the image with this URL but it doesn't seem to work. I've tried with another link (http://collection.mccord.mcgill.ca/scripts/large.php?accessnumber=MP-0000.236.1&zoomify=true&Lang=2&imageID=150623) which is the full-screen page of the same image, but it won't work either.

Could you please help me a bit?
Thanks a lot again for your time and effort, it is greatly appreciated.

Antoine

Error: Unable to load tile
