I have a JPG image with XMP metadata inside.
I'd like to read this data, but how?

$content = file_get_contents($fileName);
var_dump($content);

displays the real number of bytes, 553700, but

$len = strlen($content);
var_dump($len);

displays 373821

So I can't simply do

$xmpStart = strpos($content, '<x:xmpmeta');

because I get the wrong offset. So the question is: how do I find and read a string from a binary file in PHP? (I have the mb_string option ON in php.ini.)

UPD1:

I have some binary file. How can I check in PHP whether this file contains several strings or not?

2
  • Ah, it's clearer now. Essentially, it shouldn't matter what kind of data is being used. Can you try whether strlen($content, "iso-8859-1") gives the correct value? Commented Sep 30, 2011 at 12:25
  • $pos = strpos($content, '<x:xmpmeta', 0, 'iso-8859-1'); now it points to the right offset. Thanks. But how was I supposed to know about that last encoding parameter? :) There is no information about it in php.net/manual/en/function.strpos.php huh... Commented Sep 30, 2011 at 19:28

3 Answers

1

Essentially, it doesn't matter what kind of data you are reading - strlen() et al. should always work.

What I think is happening here is that on your server, strlen() is internally overridden by mb_strlen() (PHP's mbstring.func_overload setting does this), and the internal character encoding is set to UTF-8.

UTF-8 is a multi-byte encoding, so some of the bytes in your (wildly arbitrary) byte stream get interpreted as multi-byte characters, resulting in a shortened length of 373821 instead of 553700.
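To make the discrepancy concrete, here is a minimal sketch, assuming the mbstring extension is loaded. The four bytes are made up for illustration and are not taken from the asker's file. It counts the same data once byte by byte and once as UTF-8 characters:

```php
<?php
// Four raw bytes: 0xC3 0xA9 is one valid two-byte UTF-8 sequence ("é"),
// followed by two plain ASCII bytes.
$bytes = "\xC3\xA9" . "ab";

// Byte-wise count: the '8bit' encoding treats every byte as one character.
$byteLength = mb_strlen($bytes, '8bit');

// Character count when the same bytes are interpreted as UTF-8.
$utf8Length = mb_strlen($bytes, 'UTF-8');

var_dump($byteLength); // int(4)
var_dump($utf8Length); // int(3)
```

Every byte pair that happens to form a multi-byte sequence shrinks the character count in the same way, which would explain a large gap on a file of this size.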

I can't think of a better workaround than always explicitly specifying a single-byte encoding like iso-8859-1:

 $pos = strpos($content, '<x:xmpmeta', 0, 'iso-8859-1');

This forces strpos() (or rather, mb_strpos()) to count every single byte in the data.

This will always work; I do not know whether there is a more elegant way to force the use of a single-byte encoding.
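One possible alternative, offered as a sketch rather than a definitive fix: mbstring also accepts the encoding name '8bit', which can be passed to the mb_* functions directly so the code does not depend on whether the plain functions are overloaded. The helper names below (byteStrpos, byteSubstr, byteStrlen, extractXmp) are hypothetical, and the sample data is a synthetic stand-in for a JPG, not a real image. Note that this uses '8bit' instead of the 'iso-8859-1' shown above.

```php
<?php
// Hypothetical helpers (not part of PHP): byte-safe wrappers that use the
// mb_* functions with '8bit' when mbstring is loaded, and the plain string
// functions otherwise.
function byteStrpos(string $haystack, string $needle, int $offset = 0)
{
    return function_exists('mb_strpos')
        ? mb_strpos($haystack, $needle, $offset, '8bit')
        : strpos($haystack, $needle, $offset);
}

function byteStrlen(string $data): int
{
    return function_exists('mb_strlen')
        ? mb_strlen($data, '8bit')
        : strlen($data);
}

function byteSubstr(string $data, int $start, int $length): string
{
    return function_exists('mb_substr')
        ? mb_substr($data, $start, $length, '8bit')
        : substr($data, $start, $length);
}

// Returns the text from '<x:xmpmeta' through '</x:xmpmeta>', or null if
// either marker is missing.
function extractXmp(string $content): ?string
{
    $closeTag = '</x:xmpmeta>';

    $start = byteStrpos($content, '<x:xmpmeta');
    if ($start === false) {
        return null;
    }

    $end = byteStrpos($content, $closeTag, $start);
    if ($end === false) {
        return null;
    }

    return byteSubstr($content, $start, $end + byteStrlen($closeTag) - $start);
}

// Synthetic stand-in for a JPG: six non-UTF-8 bytes, a tiny XMP packet,
// then two trailing bytes.
$packet  = '<x:xmpmeta xmlns:x="adobe:ns:meta/">demo</x:xmpmeta>';
$content = "\xFF\xD8\xFF\xE1\xC3\x28" . $packet . "\xFF\xD9";

$offset = byteStrpos($content, '<x:xmpmeta');
$xmp    = extractXmp($content);

var_dump($offset); // int(6)
echo $xmp, "\n";
```

For a real file, $content would come from file_get_contents($fileName) as in the question. The same byteStrpos() call can be repeated for each string you want to check for.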


1

getID3 is a PHP package that claims to be able to read XMP metadata.

3 Comments

OK, thanks. It can help, but what if I have to read my own strings from binary files? How can that be done?
@Lari you'd have to build your own JPG parser. While surely very interesting, it's likely to be a huge task.
Let's forget JPG. :) I have some binary file. How can I check in PHP whether this file contains several strings or not?
0

The exif_read_data() PHP function could help with reading the XMP metadata.

More info here: http://php.net/manual/en/function.exif-read-data.php
