The conversion of HTML into MS Word Document mainly used in the web application to generate a .doc/.docx file with dynamic HTML content. Microsoft word document is the most popular file format to export dynamic content for offline use. Export HTML content to MS word functionality can be easily implemented using JavaScript. But, if you want to convert dynamic content to Doc, server-side interaction is needed.
The server-side export to word functionality is very useful to convert dynamic HTML content to MS word document and download as a .docx file. The MS word document can be easily generated with HTML content using PHP. In this tutorial, we will show you how to convert HTML to MS word document in PHP.
The HTML_TO_DOC class is a custom library that helps to generate MS word documents and include the HTML formatted content in the Word document using PHP.
<?php
/**
* Convert HTML to MS Word document
* @name HTML_TO_DOC
* @version 2.0
* @author CodexWorld
* @link https://www.codexworld.com
*/
class HTML_TO_DOC
{
var $docFile = '';
var $title = '';
var $htmlHead = '';
var $htmlBody = '';
/**
* Constructor
*
* @return void
*/
function __construct(){
$this->title = '';
$this->htmlHead = '';
$this->htmlBody = '';
}
/**
* Set the document file name
*
* @param String $docfile
*/
function setDocFileName($docfile){
$this->docFile = $docfile;
if(!preg_match("/\.doc$/i",$this->docFile) && !preg_match("/\.docx$/i",$this->docFile)){
$this->docFile .= '.doc';
}
return;
}
/**
* Set the document title
*
* @param String $title
*/
function setTitle($title){
$this->title = $title;
}
/**
* Return header of MS Doc
*
* @return String
*/
function getHeader(){
$return = <<<EOH
<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 9">
<meta name=Originator content="Microsoft Word 9">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
<title>$this->title</title>
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Print</w:View>
<w:DoNotHyphenateCaps/>
<w:PunctuationKerning/>
<w:DrawingGridHorizontalSpacing>9.35 pt</w:DrawingGridHorizontalSpacing>
<w:DrawingGridVerticalSpacing>9.35 pt</w:DrawingGridVerticalSpacing>
</w:WordDocument>
</xml><![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Verdana;
panose-1:2 11 6 4 3 5 4 4 2 4;
mso-font-charset:0;
mso-generic-font-family:swiss;
mso-font-pitch:variable;
mso-font-signature:536871559 0 0 0 415 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{mso-style-parent:"";
margin:0in;
margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:7.5pt;
mso-bidi-font-size:8.0pt;
font-family:"Verdana";
mso-fareast-font-family:"Verdana";}
p.small
{mso-style-parent:"";
margin:0in;
margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:1.0pt;
mso-bidi-font-size:1.0pt;
font-family:"Verdana";
mso-fareast-font-family:"Verdana";}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;
mso-header-margin:.5in;
mso-footer-margin:.5in;
mso-paper-source:0;}
div.Section1
{page:Section1;}
-->
</style>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1032">
<o:colormenu v:ext="edit" strokecolor="none"/>
</o:shapedefaults></xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1"/>
</o:shapelayout></xml><![endif]-->
$this->htmlHead
</head>
<body>
EOH;
return $return;
}
/**
* Return Document footer
*
* @return String
*/
function getFotter(){
return "</body></html>";
}
/**
* Create The MS Word Document from given HTML
*
* @param String $html :: HTML Content or HTML File Name like path/to/html/file.html
* @param String $file :: Document File Name
* @param Boolean $download :: Wheather to download the file or save the file
* @return boolean
*/
function createDoc($html, $file, $download = false){
if(is_file($html)){
$html = @file_get_contents($html);
}
$this->_parseHtml($html);
$this->setDocFileName($file);
$doc = $this->getHeader();
$doc .= $this->htmlBody;
$doc .= $this->getFotter();
if($download){
@header("Cache-Control: ");// leave blank to avoid IE errors
@header("Pragma: ");// leave blank to avoid IE errors
@header("Content-type: application/octet-stream");
@header("Content-Disposition: attachment; filename=\"$this->docFile\"");
echo $doc;
return true;
}else {
return $this->write_file($this->docFile, $doc);
}
}
/**
* Parse the html and remove <head></head> part if present into html
*
* @param String $html
* @return void
* @access Private
*/
function _parseHtml($html){
$html = preg_replace("/<!DOCTYPE((.|\n)*?)>/ims", "", $html);
$html = preg_replace("/<script((.|\n)*?)>((.|\n)*?)<\/script>/ims", "", $html);
preg_match("/<head>((.|\n)*?)<\/head>/ims", $html, $matches);
$head = !empty($matches[1])?$matches[1]:'';
preg_match("/<title>((.|\n)*?)<\/title>/ims", $head, $matches);
$this->title = !empty($matches[1])?$matches[1]:'';
$html = preg_replace("/<head>((.|\n)*?)<\/head>/ims", "", $html);
$head = preg_replace("/<title>((.|\n)*?)<\/title>/ims", "", $head);
$head = preg_replace("/<\/?head>/ims", "", $head);
$html = preg_replace("/<\/?body((.|\n)*?)>/ims", "", $html);
$this->htmlHead = $head;
$this->htmlBody = $html;
return;
}
/**
* Write the content in the file
*
* @param String $file :: File name to be save
* @param String $content :: Content to be write
* @param [Optional] String $mode :: Write Mode
* @return void
* @access boolean True on success else false
*/
function write_file($file, $content, $mode = "w"){
$fp = @fopen($file, $mode);
if(!is_resource($fp)){
return false;
}
fwrite($fp, $content);
fclose($fp);
return true;
}
}
The following example code convert HTML content to MS word document and save it as a .docx
file using HTML_TO_DOC class.
1. Load and initialize the HTML_TO_DOC class.
// Load library
include_once 'HtmlToDoc.class.php';
// Initialize class
$htd = new HTML_TO_DOC();
2. Specify the HTML content want to convert.
$htmlContent = '
<h1>Hello World!</h1>
<p>This document is created from HTML.</p>';
3. Call the createDoc()
function to convert HTML to Word document.
$htmlContent
).my-document
).$htd->createDoc($htmlContent, "my-document");
Download word file:
To download the word file, set the third parameter of the createDoc()
to TRUE.
$htd->createDoc($htmlContent, "my-document", 1);
You can convert the HTML file content to Word document by specifying the HTML file name.
$htd->createDoc("source.html", "my-document");
If an extension is not mentioned in the file name passed in the createDoc() method, it will be saved as a .doc file format by default. To save a word document as a .docx file format, specify the file name with an extension.
$htd->createDoc($htmlContent, "my-document.docx");
Export HTML to MS Word Document using JavaScript
There are various third-party library is available for HTML to Word conversion. But, you can convert the HTML content to Word Document using PHP without any external library. Our HTML_TO_DOC class provides an easy way to convert dynamic HTML content to Word documents and save/download as a .docx file using PHP. You can easily enhance the functionality of the HTML To Doc class as per your needs.
Do you want to get implementation help, or enhance the functionality of this script? Click here to Submit Service Request
Thank you for great Tutorial. How do you set Landscape Orientation ?
Good morining. Code has one glitch in HtmlToDoc.class.php
Line 55: $return = <<<EOH
There is an extra space after the H that gives the following error: Parse error: syntax error, unexpected '<<' (T_SL)
Otherwise it works perfectly. Thank you!
Good approach for small conversions! HTML to DOCX can also be done using phpdocx (phpdocx.com) that does the transformation using OOXML tags, so it’s compatible with all DOCX readers and support many tags and styles.
hi i want to convert doc and docx both type of file to html please help me
Great work. But how can I make html -tag to word?
Great Tutorials
Hello,
Great solution!
I am looking fo a solution to integrate a input text field into the word doc.
Has someone an idea?
Thank you for great Tutorial. How do you set Landscape Orientation ?
How can I pass the already existing template to this and do the changes?
How add page number?
shall we remove the page margins
Great Tutorials