Comments (6)

zzemchik commented on June 12, 2024

I think I managed to figure it out. Now my program works as I wanted. Thank you very much for your help and for the library)

from hummusjs.

galkahana commented on June 12, 2024

Removing an object from a file is not really an option with PDF modifications. You can mark it as deleted with objectsRegistry.DeleteObject(xobjectID); which means that a reader application ignores its content.

I'm fairly sure inPDFWriter.GetObjectsContext().StartModifiedIndirectObject(xobjectID); is not required, and the delete call is sufficient.

Depending on how exactly you are copying the file, you can avoid copying the object in the first place, or replace it (via ReplaceSourceObjects) with an object whose content is null: create a new object, write the null PDF keyword as its content and finish it, and you have a null object.
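
For reference, a "null object" in this sense is an ordinary indirect object whose body is just the `null` keyword. The sketch below only shows what such an object looks like once serialized. `nullIndirectObject` is a hypothetical helper written for illustration, not a PDFWriter function, and it assumes generation number 0.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration only; not part of PDFWriter.
// Returns the serialized form of an indirect object whose content is `null`.
std::string nullIndirectObject(unsigned long objectID) {
    // Layout: "<id> <generation> obj", the body, then "endobj".
    // Generation 0 is an assumption made for this sketch.
    return std::to_string(objectID) + " 0 obj\nnull\nendobj\n";
}
```

In the library itself you would write the object through the objects context rather than build the string by hand.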


zzemchik commented on June 12, 2024

Oh, really hard. How can I copy objects step by step while replacing old ones? I'm trying to do it something like this:

inPDFWriter.StartPDF("/home/ivan/pdf/test_2_image_modyfy.pdf", ePDFVersion14);
// the file I'm trying to copy
std::shared_ptr<PDFDocumentCopyingContext> copyingContext(inPDFWriter.CreatePDFCopyingContext("/home/ivan/pdf/mini_pdf.pdf"));

copyingContext->CopyNewObjectsForDirectObject(objectIDTypeList); // let's imagine that I created objectIDTypeList
inPDFWriter.EndPDF();

And when I do this, my PDF is always broken: some objects inside are not completed.

In general, my task is to copy a PDF file with the substitution of some objects (pictures), how can I do this? Sorry to waste your time, I'm just a little short on documentation...


galkahana commented on June 12, 2024

It's alright, sorry for the docs being short.
Oh, and now I realize this actually belongs in PDFWriter, right? The code is C++, so let me at least answer using the C++ names.

OK, so for general copying of a PDF while replacing some of its objects, you want a combination of copying the full PDF and using copyingContext->ReplaceSourceObjects for the objects you want to replace.

To copy the full PDF you can use code similar to the library's RecryptPDF function, which does basically just that: it copies the full PDF starting from the root object, using copyingContext->CopyObject (which is recursive), then sets the root of the new PDF to be the copied object. You can find the code here:
https://github.com/galkahana/PDF-Writer/blob/master/PDFWriter/PDFWriter.cpp#L846

Now, prior to calling CopyObject you'll want to create the replacement images in the target PDF. Alternatively, allocate object IDs for them up front, to be used after copying, if you prefer to create the images afterwards.
Once you have the new IDs and have collected the IDs of the source images you want to replace, call ReplaceSourceObjects with the mapping:

void ReplaceSourceObjects(const ObjectIDTypeToObjectIDTypeMap& inSourceObjectsToNewTargetObjects);

The map keys are the original image IDs (those in the source document), and the values are the target image IDs.

So the code should largely look something like this:

// In advance, create inPDFWriter with the target file. Let's also assume that you took care of creating the new images in the file and that you have their mapping to source IDs in `ObjectIDTypeToObjectIDTypeMap sourceImagesToTargetImages`.

PDFDocumentCopyingContext* copyingContext = inPDFWriter.CreatePDFCopyingContext("/home/ivan/pdf/mini_pdf.pdf");

// set the replacement map, prior to copying
copyingContext->ReplaceSourceObjects(sourceImagesToTargetImages);

// get the source document's root (catalog) object reference
PDFObjectCastPtr<PDFIndirectObjectReference> catalogRef(copyingContext->GetSourceDocumentParser()->GetTrailer()->QueryDirectObject("Root"));

// deep-copy the whole PDF through its root; .second of the result is the ID of the copied root in the new PDF
EStatusCodeAndObjectIDType copyCatalogResult = copyingContext->CopyObject(catalogRef->mObjectID);

// set the copied root object as this document's root
inPDFWriter.GetDocumentContext().GetTrailerInformation().SetRoot(copyCatalogResult.second);

delete copyingContext;

// You probably want to end the PDF after that. Since we set the root object directly, the APIs for adding pages and such probably won't function properly; this case calls for lower-level treatment.
inPDFWriter.EndPDF();

Note: depending on your overall intent, you might want to copy only part of the document, like specific pages. In that case, don't query the original root and set the resulting root on the target document. Instead, query the original object (say, a page) and create a relevant target object (say, a page). We can get into this difference if it matters to you.
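
To illustrate why the replacement map has to be set before the copy starts, here is a minimal toy model of a recursive copy with replacements. It is a sketch only: `ToyCopier`, `ObjectGraph` and `IDMap` are hypothetical stand-ins invented for this example, not PDFWriter types, and an "object" is reduced to the list of IDs it references. The assumption it encodes is that a source ID already present in the map is never copied; references to it are simply rewritten to the mapped target ID.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy stand-ins, NOT the PDFWriter types.
using ObjectID = unsigned long;
using ObjectGraph = std::map<ObjectID, std::vector<ObjectID>>;
using IDMap = std::map<ObjectID, ObjectID>;

struct ToyCopier {
    const ObjectGraph& source;
    ObjectGraph& target;
    IDMap sourceToTarget;  // pre-seeded with replacements, then filled while copying
    ObjectID nextFreeID;   // next unused object ID in the target

    // Deep-copy `sourceID` and everything reachable from it; returns the target ID.
    ObjectID copyObject(ObjectID sourceID) {
        auto known = sourceToTarget.find(sourceID);
        if (known != sourceToTarget.end())
            return known->second;  // already copied, or replaced up front

        ObjectID targetID = nextFreeID++;
        sourceToTarget[sourceID] = targetID;  // register before recursing so cycles terminate

        std::vector<ObjectID> copiedRefs;
        for (ObjectID ref : source.at(sourceID))
            copiedRefs.push_back(copyObject(ref));
        target[targetID] = copiedRefs;
        return targetID;
    }
};
```

With a source graph where objects 1 and 2 both reference image 3, seeding the map with 3 -> 100 before calling `copyObject(1)` means object 3 is never copied and every copied reference points at 100 instead. Seeding the map after the copy would be too late, because 3 would already have been copied and registered.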


zzemchik commented on June 12, 2024

Yes it works as I wanted, thank you very much!
I would like to ask a couple more questions: is CreateImageXObjectFromJPGFile the only way to create an image? I tried CreateXObjectFromJPGFile, but I don't quite understand how to interact with it so that the image is replaced.
And the explanation from the note: do you mean the scenario when I need to copy not the entire PDF, but only individual pages? It's just that in my case I always need to copy the entire PDF. And as I understand it, there will always be recursive copying.


galkahana commented on June 12, 2024

First, on your second question: if you need to copy the whole PDF, don't mind my note :).

As for images, CreateImageXObjectFromJPGFile and CreateFormXObjectFromJPGFile are good choices, where the latter creates a form XObject with the native size of the image, instead of a 1x1 image object that you have to scale (well, maybe you'll need to scale the form as well).
There are similar methods for PNG and TIFF images. I never took the time to make a single method for all of those; maybe something to add at some point.

CreateFormXObjectFromWHATEVERFile gets a file path (or stream) and then embeds the image in the file.
Again, CreateImageXObjectFromWHATEVERFile, if available, provides a 1x1 image that you can size, and CreateFormXObjectFromWHATEVERFile uses whatever size it reads from the image (if you have your own size, maybe fitting to the box of the original image, you may want to create a wrapper form that does the sizing. Talk to me if you want to know how to do this and I can find an example/doc).

CreateFormXObjectFromWHATEVERFile returns a PDFFormXObject pointer, which you can use to get its ID with formXObject->GetObjectID(). That ID is what you want to use as the "target" value in the source-to-target map passed to ReplaceSourceObjects later.

You should also delete that formXObject object once you are done with it.

BTW, if you use CreateImageXObjectFromJPGFile instead, you get back a PDFImageXObject, and its GetImageObjectID gives you the ID. You'll probably want to create a form to size it up and then use that form's ID in your map.
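
As a rough sketch of what the "sizing" in a wrapper form amounts to: a 1x1 image XObject is stretched by a `cm` transformation matrix before the `Do` operator draws it. The helpers below only compute the content-stream text and a scale factor. `Box`, `placeUnitImage` and `fitScale` are hypothetical names invented for this example, not PDFWriter API, and the resource name (`Im1` in the usage note) is an assumption about how the image was registered in the form's resources.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper type for illustration; not part of PDFWriter.
struct Box { double x, y, width, height; };

// Content-stream fragment that draws a 1x1 image XObject so that it fills `box`.
// The matrix "w 0 0 h x y cm" scales the unit square to w by h and moves it to (x, y).
std::string placeUnitImage(const Box& box, const std::string& resourceName) {
    std::ostringstream out;
    out << "q " << box.width << " 0 0 " << box.height << " "
        << box.x << " " << box.y << " cm /" << resourceName << " Do Q";
    return out.str();
}

// Uniform scale factor that fits content of a given native size inside `box`
// while preserving its aspect ratio (useful for a native-size form XObject).
double fitScale(double nativeWidth, double nativeHeight, const Box& box) {
    return std::min(box.width / nativeWidth, box.height / nativeHeight);
}
```

For example, `placeUnitImage(Box{10.0, 20.0, 200.0, 100.0}, "Im1")` yields `q 200 0 0 100 10 20 cm /Im1 Do Q`, which is the kind of content a wrapper form would carry.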

There are examples of how to use these functions in the test files of PDFWriter.
For example, here's a test using CreateImageXObjectFromJPGFile

Depending on what exactly you end up doing I can provide further help, but let's first see how you want to approach this.
