Converting MS Office 2003 documents into PDF

Posted by Tobias on December 16th 2009

Believe it or not, not everyone has the latest Microsoft Office installed. That is why I was given the task of automating the conversion of documents to PDF. The company that ordered this project always sent its official documents to customers in Word 2003 format and received a lot of complaints because the customers couldn’t open them. PDF is a widespread format that almost every organization can read, which makes it perfect for sending documents. The company wanted an easy way to convert Word documents to PDF from within their CRM (Customer Relationship Management) system. With Office 2007 this isn’t very hard to do: you just save the document as a PDF. However, this company is still using Office 2003, which doesn’t have PDF support.

So, how do we solve this? Simple: we install one of the great freeware PDF printers available online. Sure, this would solve part of the problem, but the users would still have to open the file from within the CRM system, print it using the PDF printer, find the PDF and send it to the customer. The problem with this is that users are lazy, so we need a really simple way of converting the files from within the CRM system, preferably with a single button click.

So, how can we do this? Well, by installing a PDF printer half the job is already done. To really automate the process we must:

  1. Start Word (or some other program)
  2. Open the document
  3. Print the document to the PDF printer
  4. Pick up the newly created PDF and import it into the CRM system

Easy! Right? It would be if the PDF printer had some way of taking a filename as an argument and using it to create the PDF. Most of the PDF printing software I tested did not have this feature; some had it, but at a cost. To solve this we need to know how the PDF printers work. Most of them actually work the same way: they install a plain PostScript printer driver, create a PostScript file and invoke Ghostscript to convert the PostScript into PDF. Sounds simple, right? So why don’t we do this ourselves, you ask? Good question! Let’s!
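To make the Ghostscript half of that pipeline concrete, here is a minimal sketch in C# (the language used for the full example further down) that only builds the command-line arguments for a PostScript-to-PDF conversion. The class and method names are hypothetical and chosen for illustration; the switches themselves are standard Ghostscript options.

```csharp
using System;

static class GhostscriptArguments
{
    // Hypothetical helper: builds the argument string that turns a
    // PostScript file into a PDF with Ghostscript's pdfwrite device.
    public static string Build(string pdfPath, string psPath)
    {
        string[] parts =
        {
            "-q",                               // quiet: suppress startup messages
            "-dSAFER",                          // restrict file operations from the PostScript code
            "-dBATCH",                          // exit when done instead of waiting at a prompt
            "-dNOPAUSE",                        // do not pause after each page
            "-sDEVICE=pdfwrite",                // select the PDF output device
            "-sOutputFile=\"" + pdfPath + "\"", // where the PDF is written
            "\"" + psPath + "\""                // the PostScript input file
        };
        return string.Join(" ", parts);
    }
}
```

Calling `GhostscriptArguments.Build(@"c:\ExportedFile.pdf", @"c:\TempPSFile.ps")` gives a string that can be passed to the Ghostscript console executable. Both paths are quoted so that file names containing spaces survive.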

What we need:

I used C# to implement this solution, but I would imagine that almost any language would do. Almost... By using .NET we have access to the Office COM interface, which we can use to control Word. I’ll be using MS Word in this example, but the principle is the same for other programs like Excel or PowerPoint. I will stop rambling now and give you a code example.

Example code

// Using directives needed at the top of the file:
using System;
using System.Diagnostics;
using System.IO;
using System.Reflection;
using System.Windows.Forms;
using Word = Microsoft.Office.Interop.Word;

private void ConvertUsingCutePdf(string sFileName)
{
  Word.ApplicationClass oWord = null;
  try
  {
    if (sFileName.EndsWith(".doc", StringComparison.OrdinalIgnoreCase))
    {
      string sPdfFileName = @"c:\ExportedFile.pdf";
      string sPsFileName = @"c:\TempPSFile.ps";

      // Start Word
      oWord = new Word.ApplicationClass();
      oWord.Visible = false; // Hide the application from the user
      string sCurrentPrinter = oWord.ActivePrinter; // Remember active printer
      oWord.ActivePrinter = @"CutePDF Writer";

      object fileName = sFileName;
      object oTrue = true;
      object oFalse = false;
      object missing = Missing.Value;

      // Open the document read-only, without adding it to the recent files list
      Word.Document doc = oWord.Documents.Open(ref fileName, ref missing,
                          ref oTrue, ref oFalse, ref missing, ref missing,
                          ref missing, ref missing, ref missing, ref missing,
                          ref missing, ref missing, ref missing, ref missing,
                          ref missing, ref missing);

      object copies = "1";
      object pages = "";
      object range = Word.WdPrintOutRange.wdPrintAllDocument;
      object items = Word.WdPrintOutItem.wdPrintDocumentContent;
      object pageType = Word.WdPrintOutPages.wdPrintAllPages;
      object outputFilename = sPsFileName;

      // Print the document to a PostScript file. Background printing is
      // turned off (first argument) so the file is complete before
      // Ghostscript reads it.
      doc.PrintOut(ref oFalse, ref oFalse, ref range,
                   ref outputFilename, ref missing,
                   ref missing, ref items, ref copies,
                   ref pages, ref pageType, ref oTrue,
                   ref oTrue, ref missing, ref oFalse,
                   ref missing, ref missing,
                   ref missing, ref missing);

      doc.Close(ref oFalse, ref missing, ref missing); // Close without saving
      oWord.Visible = false; // Hide Word again
      oWord.ActivePrinter = sCurrentPrinter; // Restore the user's printer

      // Convert the PostScript file into PDF using Ghostscript.
      // Note the plain hyphen in -dNOPAUSE; a typographic dash will not work.
      string sArgs = string.Format(
          "-q -dSAFER -dBATCH -dNOPAUSE -sDEVICE=pdfwrite " +
          "-sOutputFile=\"{0}\" \"{1}\"",
          sPdfFileName, sPsFileName);
      ProcessStartInfo startInfo = new ProcessStartInfo();
      startInfo.FileName = @"gswin32c"; // Must be on the PATH
      startInfo.Arguments = sArgs;
      startInfo.UseShellExecute = false; // Required for CreateNoWindow to take effect
      startInfo.CreateNoWindow = true;
      startInfo.WindowStyle = ProcessWindowStyle.Hidden;

      using (Process proc = Process.Start(startInfo))
      {
        // Wait 10 sec for the process to complete, then give up
        proc.WaitForExit(10000);
        if (!proc.HasExited)
        {
          proc.Kill();
        }
      }

      // Delete temporary files
      File.Delete(sPsFileName);
    }
    else
      MessageBox.Show("This program only supports Microsoft Word documents.");
  }
  catch (Exception ex) { MessageBox.Show(ex.ToString()); }
  finally
  {
    // Close Word, whether or not the conversion succeeded
    if (oWord != null)
    {
      object oFalse = false;
      object oMissing = Missing.Value;
      oWord.Quit(ref oFalse, ref oMissing, ref oMissing);
    }
  }
}

So there you have it! It needs better error handling and some other changes to suit your exact needs, but it can be a good starting point.
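As one possible direction for the error handling, here is a minimal sketch of a helper that runs an external converter such as Ghostscript and reports failures instead of silently continuing. The names `ExternalConverter`, `Run` and the parameters are hypothetical, and throwing exceptions is only one of several reasonable designs; adapt it to however your CRM integration reports errors.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class ExternalConverter
{
    // Hypothetical helper: runs a command-line tool and throws if it
    // times out, exits with a non-zero code, or leaves no output file.
    // Process.Start itself throws a Win32Exception if the executable
    // cannot be found.
    public static void Run(string exePath, string arguments,
                           string expectedOutput, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = arguments,
            UseShellExecute = false, // needed for CreateNoWindow to apply
            CreateNoWindow = true
        };

        using (Process proc = Process.Start(startInfo))
        {
            if (!proc.WaitForExit(timeoutMs))
            {
                proc.Kill();
                throw new TimeoutException(
                    "The converter did not finish within " + timeoutMs + " ms.");
            }

            if (proc.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "The converter failed with exit code " + proc.ExitCode + ".");
            }
        }

        if (!File.Exists(expectedOutput))
        {
            throw new FileNotFoundException(
                "The converter finished but did not write the output file.",
                expectedOutput);
        }
    }
}
```

In the example above, the Ghostscript section could then be reduced to a single call along the lines of `ExternalConverter.Run("gswin32c", sArgs, sPdfFileName, 10000);`, with the existing `catch` block showing whatever went wrong.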

A few notes regarding this code:

Comments
Posted by WNfsQQSnANepaKWTmsd on July 29th 2011 23:25
This could not possibly have been more helpful!