image compression and mime-type from WordML

06-13-2005, 07:10 PM
I'm not quite sure what to call this, so I'll just explain.
I'm grabbing the image out of a WordML binData element, which appears to be base64 encoded, making things all the easier. I store this data in my database and output it from another script call, image.php?image=00300003.png for instance. Everything is set up and working fine. However, I noticed that my jpg files are enormous. One of my jpegs is 850x850 (I'll be scaling that of course, that's far too large for my purposes) and weighs in at a beautiful 0.4MB. I dumped this data into a file so that I could manipulate it with my applications. Besides programs like PSP, which are fairly good with their compression, I also opened it in standard Microsoft Paint, made no alterations, and resaved it as a .jpg file. The new file weighs in at only 96KB. That's quite a difference from 0.4MB, and quite interesting considering it came from plain MS Paint.

Now, my question is more for the GD/image experts here who may be able to help me out. To start with, I cannot alter the images on my home PC, as they will be uploaded from an external network via cron to my webhost. As I mentioned, generating the image is not the hard part, it's the rest that follows.
What I'm looking for is two things:
1. Can this be easily converted to a .png file so I can save the file data within my database? I'm assuming this can be done using GD, but I'm new to it.
2. Is there a way to compress these files so that they don't exceed, say, 120KB (depending on the image in question, of course; take the 0.4MB one for example)?

Many thanx in advance!

06-14-2005, 01:15 AM
Quote: "1. Can this be easily converted to a .png file so I can save the file information within my database - I'm assuming this can be done using GD, but this I'm new to."

Image conversion like this is simple using GD: you can use imagecreatefromstring() to work with your stored image data directly and imagepng() to turn that data into a .png file. The only problem is that, as far as I know, the only way to then save that data back into the database is to write it out as a file first, read the file contents back in, and save those into the database, so you'll probably want to delete that file once you've done so.

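$blob = ''; // fill this with your db blob data
$im = imagecreatefromstring($blob); // returns false if GD cannot read the data
imagepng($im, 'myimage.png'); // write the image out as a .png file
$dbim = file_get_contents('myimage.png'); // you can now submit $dbim to the database
unlink('myimage.png'); // delete the temporary file once it has been read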

Quote: "2. Is there a way to compress these files, so that they don't exceed say, 120kb in size (of course, depending on the image in question, take the 0.4mb one for example)."
Well, imagejpeg() has an optional quality argument (0 to 100) which would aid in compression, though GD isn't the greatest for compression and your results may be less than desirable. You may find it's better to simply convert to .png, which uses lossless compression, accept the larger size in your database (for photographic images a PNG can be considerably larger than a JPEG), and output using imagejpeg() with a much lower quality where that is acceptable. That way, if you do need to output a higher quality version of the image, you still have it available, whereas if you'd stored the image as a lower quality jpeg and then needed to output it at a higher quality, you'd effectively have lost the option to do so.
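One way to aim for a size ceiling such as the 120kb mentioned in the question is to re-encode at progressively lower quality until the result fits. The following is only a sketch, assuming GD is installed; the helper name jpeg_under_limit(), its parameters, and the step size of 5 are made up for illustration:

```php
<?php
// Hypothetical helper (not part of GD): encode $im as JPEG, lowering the
// quality in steps until the result is at most $maxBytes or $minQuality is hit.
function jpeg_under_limit($im, int $maxBytes, int $startQuality = 85, int $minQuality = 30): string
{
    $data = '';
    for ($quality = $startQuality; $quality >= $minQuality; $quality -= 5) {
        ob_start();                      // capture what imagejpeg() would print
        imagejpeg($im, null, $quality);  // null filename means "write to output"
        $data = ob_get_clean();
        if (strlen($data) <= $maxBytes) {
            break;                       // small enough, stop here
        }
    }
    return $data; // can still exceed $maxBytes if $minQuality was reached first
}
```

It could be called as jpeg_under_limit(imagecreatefromstring($blob), 120 * 1024). Scaling the image down first usually saves far more than lowering the quality does.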

06-14-2005, 05:40 AM
Great, I'll look into using your suggestions here.
I've got no problem with creating the image; as it is, I already have a temporary directory to create the image in so I can grab its size.
Now, I'm clueless when it comes to images and graphics, and this isn't exactly the forum to ask this in, but you may know the answer anyway. The xml file I grab this from is ~750kb in size, and assuming that base64 encoding adds about 33% to the size of the image data, that means almost the entire file is composed of the image. I asked about conversion to png as I understand that png will better handle an image which has been scaled down in size, and make it more portable for expansion/contraction. I plan to take that 850x850 image, scale it to something like 100x100, and save it as a png file. I'm assuming that the smaller the image, the less data is required for it, but this is where I'm unsure when it comes to images. Is this how an image works as well?
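The 33% figure can be checked directly: base64 turns every 3 bytes into 4 characters, so the decoded data is about three quarters of the encoded size. A short sketch follows; base64_encoded_length() is a made-up helper name, and any line breaks Word may add inside the base64 text are ignored:

```php
<?php
// Base64 maps every 3 input bytes to 4 output characters (padded with "=").
function base64_encoded_length(int $rawBytes): int
{
    return 4 * intdiv($rawBytes + 2, 3);
}

$raw = str_repeat('x', 300000);
$encoded = base64_encode($raw);
echo strlen($encoded), "\n";               // 400000
echo base64_encoded_length(300000), "\n";  // 400000

// Upper bound on the binary payload of a 750kb file that is all base64:
echo (int) floor(750 * 3 / 4), "\n";       // 562
```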

06-14-2005, 08:50 AM
Yep, it's always been my experience that so long as you're reducing the dimensions by a noticeable amount the file size will reduce accordingly. With a reduction from 850x850 to 100x100 you should see a big improvement; I'd guess the final file will be under 40k.

Converting to gif or png first tends to give better results for small thumbnails, where those formats produce relatively small files, whereas jpeg tends to pay off more on large photographic images.

06-14-2005, 06:20 PM
6kb to be precise :)
Thanx delinear, you're a life saver. Take a quick look. Obviously I just threw this together, and it's not exactly what I would call the most stable, but let me know if there is anything you would improve upon. Also, I'll be unlinking my original files and dumping the data into a database, but I'll take care of that afterwards:

$name = explode('wordml://', $image->getAttribute('name')); // explode() replaces split(), which was removed in PHP 7
if (!file_exists('./images/bin/' . $name[1]))
{
    $imagedata = base64_decode($image->nodeValue);

    $handle = @fopen('./images/bin/' . $name[1], 'w+b');

    if ($handle === FALSE || fwrite($handle, $imagedata) === FALSE)
    {
        echo 'Cannot modify/create data';
    }
    if ($handle !== FALSE)
    {
        fclose($handle); // release the file handle once the data is written
    }
}

$image = './images/bin/' . $name[1];
$imagedata = getimagesize($image);
switch ($imagedata['mime'])
{
    case 'image/png':
        $newimage = imagecreatefrompng($image);
        break;
    case 'image/jpeg':
        $newimage = imagecreatefromjpeg($image);
        break;
}

if (!empty($newimage))
{
    $maxwidth = 100; // We shall scale down all objects to a max of 100px in width and resize according to % aspect ratio

    if ($imagedata[0] > $maxwidth)
    {
        $new_size_ratio = number_format($maxwidth/$imagedata[0], 3); // Round to three decimal places for fairly accurate precision
        $new_width = $maxwidth;
        $new_height = (int) floor($imagedata[1]*$new_size_ratio);

        // Create the new image with these proportions:
        $dest_image = imagecreate($new_width, $new_height);

        // Create the new image:
        imagecopyresized($dest_image, $newimage, 0, 0, 0, 0, $new_width, $new_height, $imagedata[0], $imagedata[1]); // Umm... this right?
        imagepng($dest_image, './images/bin/' . $name[1]); // Needn't worry about the extension ATM
    }
}

unset($image, $imagedata, $newimage);

Interestingly enough, this is all contained within an XPath foreach loop, and if I don't unset $newimage it retains the previous image's data. That's kind of odd, as I would expect it to be overwritten on each pass. Of course, I may be mistaken about the cause, as originally I had the mime case set as image/jpg instead of image/jpeg, so obviously no case matched :P
This is of course a WIP and isn't completed, but please feel free to comment on what's there. I'm new to the entire GD library and how it works.

Also, I thought it worth mentioning that one of the files created is, well... not deletable/editable. The script created an image named 03000001.png, and any attempt I make on my filesystem to edit or delete this file results in a nice "unalterable" message, though I see nothing that could currently be accessing the file. I also restarted my computer in case the file was still locked by the executing script, but nothing, it's still unalterable. Any ideas on that? (Yeah, I know it's nowhere near PHP related :P)

06-14-2005, 09:51 PM
That all looks good, although I've found I get better results with imagecopyresampled rather than imagecopyresized, but as always your mileage may vary.
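For comparison, here is a rough sketch of the same width-limited resize done with imagecopyresampled() on a truecolor canvas. It assumes GD is installed; scale_to_width() is a made-up helper name, and transparency is not handled:

```php
<?php
// Hypothetical helper: scale a GD image down to at most $maxWidth pixels wide,
// keeping the aspect ratio. Images that are already narrow enough are returned as-is.
function scale_to_width($src, int $maxWidth)
{
    $width  = imagesx($src);
    $height = imagesy($src);
    if ($width <= $maxWidth) {
        return $src;
    }
    // Integer arithmetic avoids rounding surprises from a float ratio.
    $newHeight = max(1, intdiv($height * $maxWidth, $width));
    $dst = imagecreatetruecolor($maxWidth, $newHeight);
    // imagecopyresampled() interpolates between source pixels, which usually
    // looks smoother than the nearest-pixel copy done by imagecopyresized().
    imagecopyresampled($dst, $src, 0, 0, 0, 0, $maxWidth, $newHeight, $width, $height);
    return $dst;
}
```

An 850x425 source would come back as 100x50.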

As for the problem with $newimage, have you tried using imagedestroy($newimage)? Although if unsetting it works then that's fine too, but really imagedestroy is specifically designed to clear the image data from any created image so that should be the proper way to do it.
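The carry-over itself looks like ordinary PHP variable scoping rather than anything GD-specific: a foreach body does not get a fresh scope, so when no switch case matches, $newimage still holds the value from the previous pass. Here is a small GD-free sketch of one way to guard against that; the strings are placeholders standing in for the image resources:

```php
<?php
$results = [];
foreach (['image/png', 'image/jpg', 'image/jpeg'] as $mime) {
    $newimage = null; // reset at the top of every pass so nothing carries over
    switch ($mime) {
        case 'image/png':
            $newimage = 'png image';   // placeholder for imagecreatefrompng()
            break;
        case 'image/jpeg':
            $newimage = 'jpeg image';  // placeholder for imagecreatefromjpeg()
            break;
        default:
            // unsupported or misspelt type such as 'image/jpg': leave it null
            break;
    }
    $results[$mime] = $newimage;
}
var_dump($results['image/jpg']); // NULL rather than the stale 'png image'
```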

Incidentally, I found a snippet on php.net about using ob to capture the image data into a string which you can put directly into the database without having to do the whole create a file, read the file into a variable then unlink the file routine.

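ob_start(); imagepng($im); // buffer the image output instead of sending it to the browser
$imagevariable = ob_get_contents(); ob_end_clean();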
I've not tested it but I guess it's obvious really, I should have thought of it before :rolleyes:
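Wrapped up as a function, the buffering approach might look like the sketch below. It assumes GD is installed, and png_to_string() is a made-up name rather than anything from php.net:

```php
<?php
// Hypothetical helper: return the PNG encoding of a GD image as a binary string.
function png_to_string($im): string
{
    ob_start();            // everything "printed" from here on goes into a buffer
    imagepng($im);         // no filename given, so GD writes to the output buffer
    return ob_get_clean(); // fetch the buffer contents and stop buffering
}
```

The returned string is raw binary, so it can go into a BLOB column as it is, or through base64_encode() for a text column.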

I don't know about your undeletable file though, I'm afraid. I guess just mark it down to freaky computer crap :o

06-15-2005, 05:56 AM
Yeah, I noticed that comment as well. That appears to be binary though, which isn't quite what I need, though conversion would be a snap as well I suppose. ATM, I want to get it down pat with files first, then work up to the database again.
As for that file, I managed to get it off there. I'm not sure quite what I did, but I'm going to chalk it up to my small RAM capacity instead. I got it off, then ran the script again and it appeared again, same scenario :P
Thanx for your help. I've also changed imagecreate to imagecreatetruecolor, assuming that will probably help.

06-17-2005, 09:04 PM
I have tried to implement your suggestion to use imagejpeg, but even with the filename argument it is still outputting raw data to the browser, when I need that data stored as a string to upload to a MySQL database.

Any suggestions?

06-18-2005, 04:03 PM
It's been a couple of days since I worked on it, and you may need to modify it slightly for your own purposes/database, but this is what I have come up with:

// Basing XPath off of resource $xml:
$xpath = new DOMXPath($xml);

$query = $xpath->query('//w:binData');
foreach ($query AS $image)
{
    $name = explode('wordml://', $image->getAttribute('name')); // explode() replaces split(), which was removed in PHP 7
    $safe_name = mysql_real_escape_string($name[1]); // never put raw document data into SQL

    // Note: the mysql_* functions were removed in PHP 7; use mysqli or PDO on current versions.
    $imagequery = mysql_query("SELECT * FROM images WHERE name='" . $safe_name . "'");

    if (mysql_num_rows($imagequery) <= 0)
    {
        // Currently not within the database, add it in:
        // This set of image creations will be very intensive to both the server and the database. Once built, all is well.
        // Take current image string, create a new png image, and save it within the database as a (probably) 100px wide true color image.
        // This should provide a significantly smaller data size as well as create portability for larger/smaller file sizes.

        $maximum_width = 100; // This is the max width, the height is proportional to the width so a calculation shall be done.

        $original_image = @imagecreatefromstring(base64_decode($image->nodeValue));

        if ($original_image !== FALSE)
        {
            // Only continue with this if the image format is supported by the GD library.
            $original_width = imagesx($original_image);
            $original_height = imagesy($original_image);

            if ($original_width > $maximum_width && $maximum_width != -1) // -1 denotes no width container.
            {
                // Width is greater than $maximum_width, lets scale it down:
                $resize_ratio = number_format($maximum_width / $original_width, 3); // Create a thousandths decimal ratio
                $new_width = (int) floor($maximum_width);
                $new_height = (int) floor($original_height * $resize_ratio);
            }
            else
            {
                $new_width = $original_width;
                $new_height = $original_height;
            }

            // Now, we have (possibly) new image sizes. Lets create it up:
            $new_image = imagecreatetruecolor($new_width, $new_height);
            imagecopyresampled($new_image, $original_image, 0, 0, 0, 0, $new_width, $new_height, $original_width, $original_height);

            if (isset($new_image))
            {
                ob_start(); // capture the png output instead of sending it to the browser
                imagepng($new_image);
                $created_image = base64_encode(ob_get_contents());
                ob_end_clean();

                $image_width = array(
                    'original' => $original_width,
                    'new' => $new_width
                );
                $image_height = array(
                    'original' => $original_height,
                    'new' => $new_height
                );

                mysql_query("INSERT INTO images
                    (id, name, data, width, height, createdtime, lastaccessedtime)
                    VALUES
                    (NULL, '" . $safe_name . "', '" . $created_image . "',
                    '" . serialize($image_width) . "', '" . serialize($image_height) . "', '" . time() . "',
                    '" . time() . "')");
            }
        }
    }
}



Now, as you can see, I have implemented the output buffering technique which had been posted at php.net. Originally I hadn't needed it, as I had written the file to a directory and then copied it over into the database.
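The INSERT in the code above builds its SQL by string concatenation. On current PHP versions a prepared statement is the usual alternative; the following is only a sketch, assuming a PDO connection and a BLOB `data` column. store_image() is a made-up name, and it stores plain integer dimensions instead of the serialized arrays used above:

```php
<?php
// Hypothetical helper; table and column names follow the INSERT shown above.
function store_image(PDO $pdo, string $name, string $pngBytes, int $width, int $height): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO images (name, data, width, height, createdtime, lastaccessedtime)
         VALUES (:name, :data, :width, :height, :created, :accessed)'
    );
    $now = time();
    $stmt->bindValue(':name', $name);
    $stmt->bindValue(':data', $pngBytes, PDO::PARAM_LOB); // binary-safe, no base64 needed
    $stmt->bindValue(':width', $width, PDO::PARAM_INT);
    $stmt->bindValue(':height', $height, PDO::PARAM_INT);
    $stmt->bindValue(':created', $now, PDO::PARAM_INT);
    $stmt->bindValue(':accessed', $now, PDO::PARAM_INT);
    $stmt->execute();
}
```

Because the values are bound rather than spliced into the query text, a quote character in an image name cannot break the statement.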
What we don't have here ATM is a height maximum. This is simply because it matters not to me how high the image is; merely the width is all I care about.
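If a height limit were wanted as well, the target size could be worked out along these lines. This is only a sketch; fit_within() is a made-up helper, not part of the code above:

```php
<?php
// Hypothetical helper: largest size that fits inside $maxWidth x $maxHeight
// while keeping the aspect ratio. Never scales up.
function fit_within(int $width, int $height, int $maxWidth, int $maxHeight): array
{
    if ($width <= $maxWidth && $height <= $maxHeight) {
        return [$width, $height]; // already fits
    }
    if ($width * $maxHeight >= $height * $maxWidth) {
        // width is the limiting side
        return [$maxWidth, max(1, intdiv($height * $maxWidth, $width))];
    }
    // height is the limiting side
    return [max(1, intdiv($width * $maxHeight, $height)), $maxHeight];
}

print_r(fit_within(850, 850, 100, 100));  // 100 x 100
print_r(fit_within(400, 1200, 100, 150)); // 50 x 150
print_r(fit_within(80, 60, 100, 100));    // 80 x 60, left alone
```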
Also, it's not really controlled by mime-types either; this will accept any format GD supports. This, of course, is easily alterable.
For my purposes I'm using png, and doing some with gif actually, as I found my image is not true color (or if it is, the guesswork for the gif is of excellent quality for the size :P). To use this with your jpeg images, simply use imagejpeg instead of imagepng. Also, there is more to my code than what's here, but I think you get the picture.
After this, I simply created an image.php file that outputs the image info for use in an <img> tag using the name of the w:binData cell (for mine, $name[1] creates say... 0200002.jpg). Works alright, I'm going to plug away at it a bit more until I'm happy, but right now, it will suffice.
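An image.php along those lines might be sketched as follows. It assumes the `data` column holds base64 text as in the code above; prepare_image_response() is a made-up name, and the database lookup is left out:

```php
<?php
// Hypothetical helper for an image.php-style script: decode the stored base64
// data and work out which Content-Type header to send for it.
function prepare_image_response(string $storedBase64): ?array
{
    $bytes = base64_decode($storedBase64, true); // strict mode: false on invalid input
    if ($bytes === false) {
        return null;
    }
    $info = @getimagesizefromstring($bytes);     // sniffs the type from the data itself
    if ($info === false) {
        return null;
    }
    return ['mime' => $info['mime'], 'bytes' => $bytes];
}

// In image.php, after looking the row up by $_GET['image'] (lookup not shown):
// $response = prepare_image_response($row['data']);
// if ($response === null) { http_response_code(404); exit; }
// header('Content-Type: ' . $response['mime']);
// header('Content-Length: ' . strlen($response['bytes']));
// echo $response['bytes'];
```

Taking the mime type from the stored bytes rather than from the file name keeps the header correct even when an image named .jpg was re-encoded as png.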