Channel: PDF tips & tricks

How to extract images from PDF documents (C# sample)


Code sample


Extracting image data stored in a document’s image XObjects is quite straightforward with the Apitron PDF Kit .NET component. The sample code below extracts all images from a PDF document and saves each one as a bitmap file.

// requires System.IO (Stream, File, Directory, Path) and the Apitron PDF Kit namespaces
public void ExtractImages()
{
    // open document
    using (Stream stream = File.Open("Apitron PDF Kit in Action.pdf", FileMode.Open))
    {
        // create document object
        FixedDocument doc = new FixedDocument(stream);

        // make sure the output folder exists
        Directory.CreateDirectory("Images");
        int imageIndex = 0;

        // enumerate document's pages
        foreach (Page page in doc.Pages)
        {
            // extract images from PDF page
            foreach (ImageInfo info in page.ExtractImages())
            {
                // construct image file name, e.g. 0.bmp, 1.bmp, ...
                string imageFileName = Path.Combine("Images",
                    string.Format("{0}.{1}", imageIndex++, "bmp"));

                // save image to file
                using (Stream imageStream = File.Create(imageFileName))
                {
                    info.SaveToBitmap(imageStream);
                }
            }
        }
    }
}


This code enumerates the document's pages and saves every image found on them. In addition, each ImageInfo object returned by the Page.ExtractImages() call can be used to examine properties of the image being extracted, e.g. color space, width and height. The library is cross-platform and can be used to create applications targeting Windows, Windows Store, Windows Phone, Xamarin and Mono platforms.
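The sample numbers the saved files in one running sequence, so the file name alone does not show which page an image came from. Below is a minimal sketch of a naming helper that keeps the page number in the file name. It assumes you track a 1-based page counter yourself in the outer loop; `ImageFileNames` and `Build` are hypothetical names introduced here for illustration, not part of Apitron PDF Kit, and only standard .NET calls are used.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper for illustration; not part of Apitron PDF Kit.
public static class ImageFileNames
{
    // Builds a path whose file name looks like page-003-image-0007.bmp.
    // Zero padding keeps the files in extraction order when sorted by name.
    public static string Build(string directory, int pageNumber, int imageIndex,
        string extension = "bmp")
    {
        if (pageNumber < 1)
            throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (imageIndex < 0)
            throw new ArgumentOutOfRangeException(nameof(imageIndex));

        // D3 and D4 pad the numbers with leading zeros to 3 and 4 digits
        string fileName = string.Format(CultureInfo.InvariantCulture,
            "page-{0:D3}-image-{1:D4}.{2}", pageNumber, imageIndex, extension);

        return Path.Combine(directory, fileName);
    }
}
```

In the sample above, this could replace the Path.Combine call: increment a page counter at the top of the outer foreach loop and pass it to Build together with imageIndex++.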

You can find many other samples related to PDF content extraction in our free book or in our blog posts. Contact us if you have any questions and we’ll be happy to help.

A downloadable version of this article is available at the following link [PDF].
