Objective
- Combine the use of Office Word and Oracle ORDS to create a document generator controlled from Word.
- Automatically export these documents to PDF format.
- Develop a simplified service (Windows only) capable of launching multiple instances of document generation.
Description
This is a prototype I developed that prioritizes the use of Office Word both as a design tool and for merging with data from an Oracle REST ORDS service. I tested several approaches with data obtained in JSON and XML formats.
This project, available from Github, is called WOORDS
. It is implemented as a module written in VBA within a Word template. This module handles the following tasks:
- Obtaining REST data from an ORDS server, in JSON or XML format
- Loading a Word template suitable for the document to be generated
- Merging elements using Content Controls
- Exporting the document as a PDF
- Optionally, sending the PDF document via a REST API.
Woords relies on a third-party module, vba-json, for reading documents in JSON format.
A command line script, woords.cmd, launches Office Word with the needed parameters.
Let’s have a look to the way Word is started:
WINWORD.EXE /q /p%1 /z%REPDIR%\templates\woords.dotm
The tricky part involves calling Word with parameter passing. Specifically, in our case, it is necessary to pass information regarding the template to be used and possibly an ID to select specific data. In command mode, there is no selector to add parameters, but I found that /p was not used, so I utilized it to convey the parameters.
The structure of the parameter string is defined as follows:
- c:config
- i:pkid
- f:folder Name
ex: c:order:i:652:f:zzzzzz
All these details are hidden in the woords command script. We just need to launch the Woords in the following ways:
- From the command line
- Via a URL call
- Through a WebSocket (not implemented) see my other post
Architecture
Important prerequisite: The solution described in this article only works on a physical machine running Windows, as Office Word needs to be activated, or in a Windows VirtualBox under Linux, or in a Windows Docker container under Windows.
In server mode, the HTTP listener receiving the request must be able to trigger the execution of Office Word in command mode, and the listener must be on the same machine as Office Word. Therefore, we can imagine the server composed of an HTTP/HTTPS listener based on Node.js (using the access
module), for example, or an Apache HTTP server with PHP installed. Woords includes a server, swoords.js, running with Node.js.
Use Cases
Woords is suitable for the following types of documents:
- Master/detail style documents with one document per PDF file (invoice, purchase order, employee record, product sheet, etc.)
- Simple lists in table form
Woords isn’t for:
- Reports with grouping and page breaks based on break levels
Using in server mode
Since the operation of a VBA module is single-threaded, it is not feasible to make it a service at its level. However, it is possible to launch multiple instances of Word from a Node.js server. The principle is very basic, as the server launches Word in command line mode as discussed previously.
This allows for a certain level of parallelism. However, it is not advisable to expect intensive use with a large number of simultaneous requests, as this quickly leads to stability issues, resulting in Word instances that remain open.
A simple example server is provided with Woords (server\swoords.js). As a prerequisite, it is necessary to have previously installed Node.js along with the dependencies:
- express
- fs
- crypto
- child_process
This server provides two endpoints:
/viewpdf/:config/:pkid?
for launching pdf generation and viewing output pdf file/pdf/:config/:pkid?
for generation only
Office Word specificities used for merging
To represent variable fields, I chose to use Content Controls (CC) and Repeating Content Controls (RCC).
Each CC has at least the following standard attributes:
- Title
- Tag
I use the Tag attribute to record the name of the JSON or XML node with which it should be merged. In the case where it needs to display an array, the êrging will stand between a RCC and a Array type node wich the same name.
Setup and Use
Installation
Installation steps
- Download woords01.zip from https://github.com/patrickmonaco/woords
- Create a directory (ie: c:\app\woords) and unzip woords.zip
- In Office Word, add this directory to the list of trust locations in the Trust Center
- Complete the content of the seettings.xml file by entering the endpoint for obtaining OAUTH tokens and the endpoint corresponding to the REST module
- Optional: Install Node.js if you want to set up a Woords service
- Start woords by entering:
bin\woords configfile
Access Control
The data is obtained by Word from the VBA code and requests to the REST APIs are made via the MSXML2.XMLHTTP class from VBA.
Access is controlled by OAUTH with a Client Credentials type client.
This means that you must create an OAUTH Client of the Client Credentials type at ORDS level (see appendices) and save the Client ID and Client Secret in a settings.xml file accessible by Word. (see appendices)
- Create an OAUTH client (see appendices) and copy the client Id and Secret values
- Adapt the settings.xml file by copying the ID and SECRET.
- You must indicate the endpoint to obtain a valid token and the endpoint of the module. Do not put a trailing slash for the value of the ENDPOINT tag.
Organizing models and data sources
The same Word template can be used with different data types and vice versa. For this reason, I introduced the notion of configuration. This is an XML file which indicates the template, the REST endpoint to use as well as miscellaneous informations:
<CONF>
<TEMPLATE>Vision3</TEMPLATE>
<ROOT_NODE>ORDERS</ROOT_NODE>
<HANDLER>/ordersxml/</HANDLER>
<HEADER>HEADER</HEADER>
<SPLIT>Y</SPLIT>
<LIST>N</LIST>
<SAVEDOC>N</SAVEDOC>
<POST_URL>https:/xxx/</POST_URL>
<DEBUG_LEVEL>0</DEBUG_LEVEL>
</CONF>
In the configuration above, we tell Woords to use a template called vision3 and to fetch the data by making a REST call to a handler called /ordersxml/ and which must be suffixed at the endpoint specified in a global settings file.
There must be a configuration for each document template. These files are saved in the conf directory but it is also possible to place them in a global directory accessible from a REST service. In this case, you must create a woo_repo table (see the setup\woords_createrepository.sql creation script) and you must indicate the corresponding handler in the settings file described below: /conf/ . This handler is provided in the gpm.demo.sales module (setup\ORDS_REST_DEMO_gpm.demo.sales)
SAVEDOC: Indicates whether you wish to keep the intermediate Word file that was generated.
POST_URL: REST handler to which to post the PDF file or not. NOT IMPLEMENTED.
Global Settings for Woords
<CONF> <OAUTH_ENDPOINT>https://xxx/ords/demo/oauth/token</OAUTH_ENDPOINT> <ID>XXXXXXXXXXXXXXXXXXXXXXX</ID> <SECRET>ZZZZZZZZZZZZZZZZZ</SECRET> <ENDPOINT>yyyyyyyyyyyyyy/ords/demo/sales</ENDPOINT> <VERBOSE>N</VERBOSE> <DEBUG>N</DEBUG> <REPO>/conf/</REPO> </CONF>
Comments on the settings file:
- OAUTH_ENDPOINT corresponds to the URL which provides an OAUTH token (see ORDS doc)
- ID is the Client ID value for Client Credentials
- SECRET is the value of Customer Secret
- ENDPOINT: REST module URL
- VERBOSE: display or not a detailed log
- DEBUG: Display a detailed log
- REPO: optional. REST handler that provides the content of a configuration based on a name passed in the URL. Overrides individual xxx.conf configuration files
Functional tests
I took as a sample the one supplied with the APEX installations and which installs tables corresponding to order management.
A collection of customers, stores, products, and orders. This dataset includes JSON data for the product description, and longitude / latitude for the stores.
This is sufficient to illustrate common cases.
Reminders for installing this dataset
Sql workshop / Utilities / Sample Datasets.
Choisir Customer Orders.
Samples provided with Woords
The following configurations are ready to use for demo/training purpose
- helloworld (basic message)
- invoice (standard invoice with a master/detail
- order (standard purchase order with a master/detail)
- emp (standard emp table )
- cols (use for report )
Designing a Word template suitable for merging
To test my application, I used a standard template offered in Word.
It’s called Professional Sales Invoice and it’s included in the Business Blue Design business template series.
The dynamic fields are represented by Content Controls.
- Give a tag name to the CC corresponding to a XML tag or to a Json node to be displayed..
- Implement the multi-line part, corresponding to the order lines, using a Repeating Content Control (RCC) and incorporating other Content Controls (CC), one per column.
- Set the mapping between a XML tag or JSON node and the CC by using the Tag attribute (cf below).
The tag name in RCC must be the same as the XML « father » tag. (ORDER_ITEMS_FOP_V
in my example)
Repeating Content Control setup
Content Controls types to be used for repeating sections
When adding a Content Control inside a repeating section, it is important to note that the oblong symbol designating the CC must be light blue and not with a pictogram of arrows pointing to the right or the left. If this second case occurs (which I consider to be an anomaly, but for which I have no explanation), an additional CC must be created just before. If we leave things as is (with « arrows »), we lose the formatting and alignment that we could have given to CCs containing numbers.
Conditional displaying
It is possible to condition the display of the content of an RCC according to the presence or absence of data.
For example, if there are no direct reports at the level of a manager, we neutralize the display of the corresponding RCC. To do this, you must encapsulate the RCC in another RCC with the same tag name but prefixed with the “?” sign.
If the RCC tag = direct_reports, the RCC container tag will be: ?direct_reports
Producing REST data
JSON based on a collection Query
XML Data input
I have already developed a tool to produce SQL queries that generate XML. The constraint is that I cannot use a Collection Query handler. Instead, I have to use a Media Resource query, which results in the loss of pagination elements.
The XML results generated by these queries can also be converted to JSON. Thus, a Media Resource handler with a MIME type of application/json can be used. Since vba-json is case sensitive, it is better to generate column aliases in lowercase.
PDF files and logs
Each Woords session generates a random label that is used to create a folder in the pdf directory and to create a log file in the logs directory.
It is the user’s responsibility to purge these directories and files.
Performances
Launching 1 to 5 Word instances. During the test with 5 instances, one ended with an error and a second one stopped after two documents. Only three threads survived.
The ramp-up was two seconds between each launch. Conclusion: the concurrency level is a maximum of three. Beyond that, there are risks of abrupt termination.
The maximum throughput is approximately 1.7 PDF documents per second.
Annexes
Module ORDS pour la gestion des commandes
The module gpm.demo.sales is protected by a privilege named orders.priv.
One must create a CLIENT_CREDENTIAL client and get back the Client id and Client secret. These informations must be stored in the settings.xml file.
ORDERS data model
Vues ajoutées
In order to add formating and computing, it’s better to minimize the code in the REST handler code, and move this task to a sql view level.
ORDER_ITEMS_FOP_V
This view calculates the amounts for each order line
CREATE OR REPLACE FORCE EDITIONABLE VIEW "ORDER_ITEMS_FOP_V" ("ORDER_ID", "PRODUCT_ID", "PRODUCT_NAME", "QUANTITY", "UNIT_PRICE", "MNT",
FOREIGN KEY ("ORDER_ID")
REFERENCES "ORDERS" ("ORDER_ID") DISABLE) AS
select order_id, i.product_id, product_name,
quantity, i.unit_price ,(quantity * i.unit_price) mnt
from order_items i,
products p
where i.product_id = p.product_id;
ORDER_FOOTER_FOP_V
This view provides the global amounts for the footer.
CREATE OR REPLACE FORCE EDITIONABLE VIEW "ORDER_FOOTER_FOP_V" ("ORDER_ID", "TOTAL", "TVA", "TTC", "SHIPPING",
FOREIGN KEY ("ORDER_ID")
REFERENCES "ORDERS" ("ORDER_ID") DISABLE) AS
select order_id, sum(quantity * i.unit_price) total,
round(sum(quantity * i.unit_price * 0.2),2) tva,
round(sum(quantity * i.unit_price * 1.2),2) ttc,
100 as "shipping"
from order_items i,
products p
where i.product_id = p.product_id
group by order_id;
XML statement giving orders
In the sample module, see the ordersxml
handler
JSON Collection Query
In order to keep the advantages provided by the use of a query collection (pagination, among others), we can create a handler with a main query and secondary links to be able to search the data via an additional REST call.
select CUSTOMERS.CUSTOMER_ID as CUSTOMER_ID,
CUSTOMERS.FULL_NAME as FULL_NAME,
CUSTOMERS.EMAIL_ADDRESS as EMAIL_ADDRESS,
ORDERS.ORDER_DATETIME as ORDER_DATETIME,
ORDERS.ORDER_ID as ORDER_ID,
'../orders/' || ORDERS.ORDER_ID || '/lines' AS "$order_lines"
from
ORDERS ORDERS,
CUSTOMERS CUSTOMERS
where CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID
and ORDER_ID = :id
Pour la technique d’insertion d’un lien relatif, cf article Oracle-Base.
Ajout d’un client OAUTH
BEGIN
OAUTH.create_client(
p_name => 'word_client',
p_grant_type => 'client_credentials',
p_owner => 'GPM Factory',
p_description => 'A client Office Word for Orders printing',
p_support_email => 'pmo@email.com',
p_privilege_names => 'orders.priv'
);
COMMIT;
END;
Starting woords in command line
Enter bin\woords
Available Word selectors used:
- /q Word trigger in silent mode
- /p selector not used by Word and added to allow passing a parameter when launching Word which will be used by a VBA macro. The character string which immediately follows the /p selector responds to a precise syntax, specific to woords and which is indicated earlier in this document.
- /z Opens a new document based on a Word template (woords.dotm) and TRIGGERS a possible macro named Autonew.
Documents samples
Purchase order from TemplateLAB site
Invoice from Microsoft templates library
EMP table with two childs
The following sample illustrates the rendering of a master with two detail parts:
- Direct reports
- Other people working in the same department
cf emp template and /emp/handler