Transformation Rules

Figure 3.6 shows the transformation and adaptation rules that are applied to the GMX web page whenever a WAP-enabled client accesses it. A rule group called gmx.mobile has been defined (see line 32); it specifies two rules: removeEntities and extractLoginFormAndMenu. Note that the removeEntities transformation rule (lines 2-4) contains a searchFor and a replaceWith element. These elements provide text-based search-and-replace functionality that supports regular expressions. In our example, we need it to strip XML entities (&..;) that the Saxon XQuery processor cannot process.
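The search-and-replace step can be illustrated outside FOXY with a few lines of Python. This is only a sketch of the idea: the pattern, the function name remove_entities and the sample string are assumptions made for illustration and are not taken from the rule file in figure 3.6.

```python
import re

# Assumed pattern: matches named (&nbsp;), decimal (&#160;) and
# hexadecimal (&#xA0;) entity references. The actual searchFor
# expression used in the rule file may differ.
ENTITY = re.compile(r"&#?\w+;")


def remove_entities(markup: str, replace_with: str = "") -> str:
    """Replace every entity reference in markup with replace_with."""
    return ENTITY.sub(replace_with, markup)


if __name__ == "__main__":
    print(remove_entities("Login&nbsp;&raquo; GMX &#169; 2006"))
```

Note that a pattern this broad also removes the predefined XML entities such as &amp;amp; and &amp;lt;. A narrower expression would be needed if those have to survive the rule.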

The transformation rule called extractLoginFormAndMenus (line 5) is defined by an xquery element. This element has one optional attribute named preTransform (line 6). When it is set to true (the default value), the input data is assumed not to be well-formed XML (i.e., HTML) and is converted to well-formed XHTML (a process often called tidying) before the XQuery stylesheet is applied. As we did not want to implement the XQuery script directly in the rule body, we use the import element, which allows an external XQuery stylesheet file (i.e., gmx2html-table.xql, line 7) to be imported. The complete listing of this script can be found in the appendix (see figure A.3). It is similar to the script used for content adaptation for mobile clients, but it does not deliver only the login form: it also shows some "important" menus and produces HTML code instead of WML output.
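To show why the tidying step matters, the following Python sketch turns a sloppy HTML fragment into well-formed markup and only then queries it with an XML tool. It is a simplified stand-in, not FOXY's tidying component: the Tidy class, the tidy function and the list of void elements are hypothetical, and comments, DOCTYPE declarations and duplicate attributes are simply ignored.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Elements that never have content; assumed subset, for illustration only.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class Tidy(HTMLParser):
    """Re-emit HTML as well-formed XML (quoted attributes, closed tags)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # output fragments
        self.stack = []  # currently open non-void elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "checked") get their own name as value.
        text = "".join(
            f" {k}={quoteattr(v if v is not None else k)}" for k, v in attrs
        )
        if tag in VOID:
            self.out.append(f"<{tag}{text}/>")
        else:
            self.out.append(f"<{tag}{text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID or tag not in self.stack:
            return  # stray end tag: drop it
        # Close everything that was left open inside this element.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def tidy(html: str) -> str:
    parser = Tidy()
    parser.feed(html)
    parser.close()
    while parser.stack:  # close elements still open at end of input
        parser.out.append(f"</{parser.stack.pop()}>")
    return "".join(parser.out)


if __name__ == "__main__":
    raw = "<html><body><form name=login><input type=text name=user><br><p>Hi</form></body>"
    root = ET.fromstring(tidy(raw))  # would fail on the raw input
    print(root.find(".//form[@name='login']") is not None)
```

Passing the raw string directly to an XML parser raises a parse error because of the unquoted attributes and unclosed elements, which is the situation the preTransform attribute is meant to avoid.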

The extractLoginForm transformation rule is also implemented as an XQuery script. This time, the xquery element has one child named script (lines 13-28). This element indicates that the XQuery script is implemented directly in the rule database. The XQuery code (lines 15-26) is quite straightforward: first, an HTML form with the name "login" is extracted. Then it is wrapped into a WML card element and finally presented as a WML document (<!DOCTYPE wml...>). Because WAP browsers check the Content-Type header field and produce an error message whenever HTML content is detected, the value of this field must be changed to indicate that WML content is delivered (i.e., text/vnd.wap.wml, line 11). Figures 3.5 and 3.4 show screenshots of the transformations as seen on a traditional browser and a WAP phone.

Suppose that the information being extracted from the web page is large and needs to be split over a number of smaller pages. In this case, the splitting elements foxy:group and foxy:subgroup are "inserted" into the extracted content by means of XQuery or XSL instructions. The layoutPage element can then be used within the rule implementations to browse between the resulting page splits.

Note that FOXY is especially effective for web sites that use a common layout (i.e., corporate identity), because the same HTTP request pattern and transformation rules can be applied to a large number of web pages.
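The wrapping of the login form into a WML card and the accompanying Content-Type change can be sketched in Python as follows. The function name to_wml_response, the card id and the header dictionary are assumptions made for illustration; the actual rule performs the wrapping in XQuery. The DOCTYPE is passed in by the caller because its full text is not reproduced in this section, and the sketch does not translate HTML form controls into their WML counterparts.

```python
import xml.etree.ElementTree as ET

# Media type named in the text for WML content.
WML_CONTENT_TYPE = "text/vnd.wap.wml"


def to_wml_response(form: ET.Element, headers: dict, doctype: str = ""):
    """Wrap form in <wml><card> and return (new_headers, body)."""
    wml = ET.Element("wml")
    card = ET.SubElement(wml, "card", {"id": "login"})  # hypothetical id
    card.append(form)
    body = doctype + ET.tostring(wml, encoding="unicode")

    # Drop any existing Content-Type (header names are case-insensitive)
    # and announce WML so that a WAP browser accepts the response.
    new_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() != "content-type"
    }
    new_headers["Content-Type"] = WML_CONTENT_TYPE
    return new_headers, body


if __name__ == "__main__":
    login = ET.fromstring('<form name="login"><input name="user"/></form>')
    headers, body = to_wml_response(login, {"Content-Type": "text/html"})
    print(headers["Content-Type"])
    print(body)
```

Keeping the header rewrite next to the body transformation mirrors the point made above: delivering WML markup with an HTML content type would still be rejected by the WAP browser.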

**Figure 3.4:** The result of the transformation as seen on a traditional PC browser (i.e., transformation gmx.browser)
