I'm a bit lost - PDF metadata

Discussion related to "Everything" 1.5 Alpha.
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

I'm a bit lost - PDF metadata

Post by Michi »

Hi!

I've done some testing with PDF properties and assigned some metadata to a pdf file.

In Directory Opus it looks like this:
[Image: directory opus.jpg, Directory Opus columns]
Searching in ET, with some columns added, the file looks like this:
[Image: et.jpg, ET search]
I'm missing the title, the subject and the tags.

When opening the pdf in Adobe Reader, looking at the file properties, I see:
[Image: adobe.jpg, Adobe Reader]
ET is configured under Options -> Properties to index Comment, Description, Subject, Tags and Title for *.pdf files, so I wonder why at least some of those properties are missing.
void
Developer
Posts: 15677
Joined: Fri Oct 16, 2009 11:31 pm

Re: I'm a bit lost - PDF metadata

Post by void »

Thank you for the bug report Michi,

Could you please send me your test PDF at support@voidtools.com?

I'll look into this issue.

I'm wondering if indexing these properties is causing the issue.
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

Re: I'm a bit lost - PDF metadata

Post by Michi »

I would not say it's a bug, rather an issue right now :D

I assume that Directory Opus stores the metadata somewhere else. At least the comment is saved to an ADS, as Leo from the DO support team mentioned. Only the tags and the subject found their way into the PDF.

Nevertheless, ET has so far only indexed the comment, which I had previously edited using Directory Opus.

I just did an additional test: I modified the subject and the tags again and had a look at the Index Journal in ET:
[Image: index journal.jpg]
I guess ET recognized the file change. However, still only the comment is visible in the search result.

I've sent you the PDF file I'm currently testing on...

Thanks for looking into it!
Michael
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

Re: I'm a bit lost - PDF metadata

Post by Michi »

Just to be clear: my additional modification, which was recorded by the journal, was correctly indexed by ET; however, only the modified comment was indexed again, nothing else.
tuska
Posts: 960
Joined: Thu Jul 13, 2017 9:14 am

Re: I'm a bit lost - PDF metadata

Post by tuska »

Maybe this picture can help: Picture 3. -> Extended.
The pdf document was opened in Adobe Acrobat 11.0.23 to display the document properties.
 
[Image: 2022-10-25_PDF metadata_Properties.png]
...
void
Developer
Posts: 15677
Joined: Fri Oct 16, 2009 11:31 pm

Re: I'm a bit lost - PDF metadata

Post by void »

Thank you for sending the test PDF.

The metadata is being stored as XMP instead of PDF metadata.
Everything currently only looks at the PDF metadata.

The PDF is using Cross-reference streams.
Everything doesn't support Cross-reference streams yet.

I am guessing Windows Explorer also does not show any metadata under Properties -> Details for this PDF?
Everything will fall back to the system to gather properties for the PDF.
In Everything, you should see the same as Windows Explorer.

I am looking into adding native XMP support and Cross-reference stream support.
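
For anyone who wants to check which of these cases a given PDF falls into, here is a rough sketch in Python. It is only a heuristic byte scan and not how Everything parses PDFs; the function name `inspect_pdf_metadata` and the result keys are made up for illustration. It assumes the XMP packet is stored uncompressed, which is common but not guaranteed, so a negative result does not prove the absence of XMP.

```python
import re


def inspect_pdf_metadata(data: bytes) -> dict:
    """Heuristically report how a PDF stores metadata (plain byte scan, no real parsing)."""
    return {
        # The classic document information dictionary is referenced via /Info.
        "has_info_reference": re.search(rb"/Info\s+\d+\s+\d+\s+R", data) is not None,
        # XMP packets start with an <?xpacket begin=...?> processing instruction.
        "has_xmp_packet": b"<?xpacket begin" in data,
        # Cross-reference streams are stream objects whose dictionary has /Type /XRef.
        "uses_xref_stream": re.search(rb"/Type\s*/XRef\b", data) is not None,
        # A classic cross-reference table is followed by the keyword "trailer".
        "has_classic_trailer": re.search(rb"\btrailer\b", data) is not None,
    }


if __name__ == "__main__":
    import sys

    # Usage: python inspect_pdf.py some.pdf
    with open(sys.argv[1], "rb") as f:
        for key, value in inspect_pdf_metadata(f.read()).items():
            print(f"{key}: {value}")
```

A file that reports an XMP packet and a cross-reference stream, but no classic trailer, would match the situation described above.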
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

Re: I'm a bit lost - PDF metadata

Post by Michi »

Hi! Just got your e-mail and your answer! Many thanks for this!

To be fair, this document is the only one I've seen so far with this issue, so it's certainly not a big problem. Anyway, I look forward to your enhancement, which will again put ET one step ahead of other similar tools!
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

Re: I'm a bit lost - PDF metadata

Post by Michi »

void wrote (Tue Oct 25, 2022 11:17 pm): "I am guessing Windows Explorer also does not show any metadata under Properties -> Details for this PDF?"
Yes, exactly!

With Adobe Reader installed, Explorer does not show even a single metadata property. However, Microsoft Edge is configured as the primary PDF viewer, if this matters.
[Image: pdf property.jpg]
[Image: pdf details.jpg]
David.P
Posts: 183
Joined: Fri May 29, 2020 3:22 pm

Re: I'm a bit lost - PDF metadata

Post by David.P »

Just discovered this interesting thread, which I'd like to join right away.

I am also looking for a way to search metadata in PDF files, e.g. the ones shown below:
Image

Have I set this up correctly below in Everything's options (using "Producer" as an example)?
Image

The file extensions *.mkv;*.mp4;*.avi;*.flv;*.webm were present by default, and I only added *.pdf.

Does Everything need to read the entire file contents of all corresponding files for indexing metadata? I'm asking because I have a few 100 GB of files on a slow remote Windows server over VPN.
void
Developer
Posts: 15677
Joined: Fri Oct 16, 2009 11:31 pm

Re: I'm a bit lost - PDF metadata

Post by void »

"Does Everything need to read the entire file contents of all corresponding files for indexing metadata?"
No.
Properties with the type "Metadata" are stored in a file header.
Everything will only read the file header (not the entire file content).

Properties with the type "Content" require reading the entire file content.
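
The difference in I/O between the two property types can be pictured with a small Python sketch. This is not Everything's code: the helper names and the 64 KiB limit are assumptions chosen for illustration, and the thread does not say how many bytes Everything actually reads for a header.

```python
from pathlib import Path

# Assumed header size, for illustration only.
HEADER_BYTES = 64 * 1024


def read_header(path, limit: int = HEADER_BYTES) -> bytes:
    """Read at most `limit` bytes from the start of the file ("Metadata"-style access)."""
    with open(path, "rb") as f:
        return f.read(limit)


def read_content(path) -> bytes:
    """Read the whole file ("Content"-style access)."""
    return Path(path).read_bytes()
```

Over a slow link, `read_header` transfers a bounded amount per file regardless of file size, while `read_content` transfers every byte.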
David.P
Posts: 183
Joined: Fri May 29, 2020 3:22 pm

Re: I'm a bit lost - PDF metadata

Post by David.P »

Thanks very much