HTML Dialect 008

Author: Henrik Mikael Kristensen
Date: 22-Aug-2010
Copyright: 2010 - HMK Design
Version: 0.0.8

This is alpha software under development and may have various bugs. Features may change drastically during development.

Introduction

The purpose of the HTML dialect is to produce HTML code using a REBOL dialect. There are multiple reasons for this:

  • Dialect code takes up much less space than HTML and is simpler and easier to write.
  • Fits both static and dynamic, offline and online HTML content generation.
  • Generate HTML from REBOL types like objects and blocks.
  • Use loops for generating HTML from data traversal.
  • It's easier to make dynamic content.
  • Provide standards compliant HTML, no matter the doctype, using the same dialect code.
  • REBOL code all the way. No need to intermix REBOL code with HTML.

Some things that it doesn't do:

  • It doesn't produce CSS or Javascript code.

The HTML Dialect is currently in version 0.0.8 and is released under the BSD license.

Installation

In order to use the HTML dialect, include the html.r file, available from the download page, in your code. You will know it's loaded when the ctx-html context exists in memory.
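
A minimal sketch of loading the dialect and checking for the context. It assumes html.r has been saved in the current directory; that path is an assumption, so adjust it to wherever you placed the download:

```rebol
; Assumed location of the downloaded file; adjust the path as needed.
do %html.r

; ctx-html should be defined once html.r has been evaluated.
either value? 'ctx-html [
    print "HTML dialect loaded"
][
    print "HTML dialect not loaded"
]
```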

Usage

This is primarily intended for use with a web server, such as Cheyenne, so output is collected in a buffer (a plain text string) called out-buffer. You can output its content with print, save it to disk, or use it in any other way you like.

You generate HTML with the html-gen or the output-html function. This is similar to the layout function in VID in REBOL/View, if you've tried that.

html-gen

The html-gen function performs the parsing of the dialect and fills out-buffer with HTML code. Then it returns TRUE or FALSE, depending on whether the dialect was correctly parsed. Example:

>> html-gen [=== test ["my code"]]
== true
>> out-buffer
== "<test>my code</test>"

Every time you add a word!, string!, block! or other piece of dialect code inside the dialect, it's run through html-gen and appended to the end of out-buffer without spacing.

>> html-gen ["more code"]
== true
>> out-buffer
== "<test>my code</test>more code"

html-gen accepts a variety of datatypes and can therefore be used to generate small bits of HTML code, output a single char or even nothing, if your input is none!. html-gen uses itself extensively inside its own parser for this purpose to minimize code size and allow dialect recursion.

If you want to generate multiple pages in sequence with html-gen, use clear out-buffer between generating pages with html-gen.
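
A sketch of generating two pages in sequence and saving each to disk. The file names and page content are made up for illustration; only html-gen, out-buffer and the page command come from this document:

```rebol
; First page: start from an empty buffer.
clear out-buffer
html-gen [page "First Page" ["This is page one"]]
write %first.html out-buffer

; Second page: clear again, otherwise page two is appended to page one.
clear out-buffer
html-gen [page "Second Page" ["This is page two"]]
write %second.html out-buffer
```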

Other examples

This code:

html-gen "Text"

Will append the following to out-buffer:

Text

This code:

html-gen 'a

Will append the following to out-buffer:

a

This code:

html-gen [tag [shout till noon]]

Will append the following to out-buffer:

<shout till="noon">

output-html

The output-html function wraps html-gen and is useful if you generate the HTML in one operation and want it returned directly, for example to the console.

The function performs the following:

  1. It first clears the out-buffer.
  2. Then it generates the HTML into the out-buffer.
  3. Then it returns the out-buffer. If you need the string elsewhere, you should copy it.

Example:

>> output-html [=== test ["my code"]]
== "<test>my code</test>"
>> output-html ["more code"]
== "more code"
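
Because output-html returns the out-buffer itself rather than a new string, a result you want to keep should be copied before the next call. A small sketch, with variable names chosen only for illustration:

```rebol
; Without copy, both words would refer to the same out-buffer string,
; which the second call clears and refills.
first-result: copy output-html [=== test ["my code"]]
second-result: copy output-html ["more code"]

print first-result     ; still holds the result of the first call
print second-result
```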

Both html-gen and output-html accept none!, string!, tag!, file!, url!, number!, time!, date!, get-word!, word! and block! as input.

For the examples below we will use output-html for simplicity.

HTML Output Notes

Some things about the output:

  • html-gen never inserts spaces between pieces of output, so you must add any spaces that are needed yourself.
  • The output is always a string.
  • The output may not be very readable, as html-gen does not add newlines or indentations to the HTML code. This is to keep the HTML code as small as possible.

Full Page

Example for generating one full page:

output-html [page "My Page Title" ["This is my Webpage"]]

This produces the following HTML code (indentation used here for clarity, it is not present in the actual output):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <title>My Page Title</title>
  </head>
  <body>This is my Webpage</body>
</html>

If you want to add a style sheet to the basic page, pass it as a parameter to the page command:

output-html [page "My Page Title" css style.css ["This is my Webpage"]]

Produces:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <link href="style.css" rel="stylesheet" type="text/css" />
    <title>My Page Title</title>
  </head>
  <body>This is my Webpage</body>
</html>

Tags

Tags are passed as words if they are recognized as HTML tags:

output-html [p "test"]
== "<p>test</p>"

A word that is not recognized as an HTML tag is output as plain text:

output-html [py "test"]
== "pytest"

The HTML Dialect tracks which tags are valid HTML single-tags, for example <link />. Single-tags are not entered directly as words, as they usually take special parameters, which is currently not handled in the dialect.

Any tag can be entered using tag, which takes a block:

output-html [tag [p]]

The block is a full parameter list for a start-tag or a single-tag:

output-html [tag [p class decorative]]
== {<p class="decorative">}

To close tags, use end-tag. The command closes the most recently opened start-tag that is not a single-tag:

output-html [tag [p class decorative] end-tag]
== {<p class="decorative"></p>}

It can be used consecutively:

output-html [tag [div] tag [p class decorative] end-tag end-tag]
== {<div><p class="decorative"></p></div>}

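When used after a single-tag, end-tag does not close it, because single-tags are not tracked. Instead it closes whatever start-tag is still waiting to be closed, here a <p> left open by an earlier call: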

output-html [tag [link] end-tag]
== "<link /></p>"

Since tag recognizes single-tags, the full-tag command is useful in cases where you need to enter a tag that is normally a single-tag as a complete start-tag and end-tag pair. This is useful when creating RSS and ATOM feeds.
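
A sketch of how full-tag might be used for an RSS item. The item content and URL are hypothetical. <link> is a single-tag in HTML, but RSS expects it to wrap a URL, so full-tag is used to get a paired <link>...</link> rather than <link />:

```rebol
; Hypothetical RSS fragment. Each end-tag closes the most recently
; opened tag: first title, then link, then the enclosing item.
output-html [
    full-tag [item]
        full-tag [title] "First post" end-tag
        full-tag [link] "http://www.example.com/first-post" end-tag
    end-tag
]
```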

A more intuitive way to produce multiple levels of tags is by using the === command:

output-html [=== div [=== p []]]

Passing a parameter block in place of the tag name is currently not possible:

output-html [=== [p class decorative] [content]]

The major differences between the different ways of writing tags:

tag
  Begins a tag; the tag name is passed as an argument. This makes it possible to begin the tag in one dialect block and end it in another. tag prints the tag as a single-tag if it is a valid HTML single-tag. A valid single-tag is not tracked for end-tag.

full-tag
  Similar to tag, except that it never produces single-tags. This is useful if you wish to use any tag name, regardless of its single-tag status in the HTML standard. Every tag is tracked for end-tag.

end-tag
  Ends any tag that is not a single-tag, or any full-tag. Note again that the start and end don't have to exist in the same dialect block. End-tags are stored in the end-tags block inside the ctx-html context. This block is treated as a LIFO stack: consecutive calls empty the stack one tag at a time, closing the most recently opened tag first.

===
  Takes a block as input argument, so the whole tag must always be written in the same HTML dialect block. It provides quicker notation for hand-written HTML dialect code.

Tags across Dialect Blocks

Example of cross-block tags and end tag usage:

output-html [tag [this]]
== "<this>"
output-html [end-tag]
== "</this>"

For the following example, consider that <br>, <hr> and <img> are in the single-tags block:

Examples:

output-html [tag [p] tag [hr] end-tag]
== "<p><hr /></p>"

output-html [full-tag [p] full-tag [hr] end-tag]
== "<p><hr></hr>"

Example of using a valid HTML single tag along with an erroneous end-tag:

output-html [tag [link]]
== "<link />"
output-html [end-tag]
** Script Error: Out of range or past end
** Where: html-gen
** Near: out close-tag either block? last

Other examples

output-html [=== p [=== pre [=== tt [=== a [Hello] ]]]]
== "<p><pre><tt><a>Hello</a></tt></pre></p>"

output-html [
  tag [p] tag [pre] tag [tt] tag [a] "Hello"
  end-tag end-tag end-tag end-tag
]
== "<p><pre><tt><a>Hello</a></tt></pre></p>"

output-html [p [pre [tt [a Hello]]]]
== "<p><pre><tt><a>Hello</a></tt></pre></p>"

End Tag Errors

As tags are tracked across multiple uses of html-gen, ending a tag incorrectly in one use of html-gen will cause subsequent uses of it to contain errors. Furthermore, the HTML dialect can't track if your complete page contains too few end-tags. The only way is to check if ctx-html/end-tags is empty at the end of page generation.
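
A sketch of the check described above, run after a page has been generated. The dialect block is a made-up example that deliberately leaves a div open:

```rebol
; The div below is never closed, so one entry should remain tracked.
output-html [tag [div] tag [p] "Unfinished" end-tag]

either empty? ctx-html/end-tags [
    print "All tags were closed"
][
    print ["Unclosed tags remain:" mold ctx-html/end-tags]
]
```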

CSS Styles

HTML can be styled using issue!, which represents a tag ID, or refinement!, which represents a tag CLASS. Consecutive refinements or issues add to the styles used for the tag. You are also free to combine them.

Examples:

output-html [div #headline "Hello"]
== {<div id="headline">Hello</div>}

output-html [div /text /headline "Hello"]
== {<div class="text, headline">Hello</div>}

output-html [div /text #headline "Hello"]
== {<div class="text" id="headline">Hello</div>}

HTML Doctypes

The HTML dialect supports most available versions of the HTML specs, except for HTML5 at this time. When including the page command on the web page, the !DOCTYPE tag is automatically included at the top of the webpage. Single standing tags in XHTML 1.0 and upwards are always postfixed with a /. The following types are supported:

html-2.0-dtd
html-3.2-dtd
html-4.01-strict
html-4.01-transitional
html-4.01-frameset
xhtml-1.0-strict
xhtml-1.0-transitional
xhtml-1.0-frameset
xhtml-1.0-dtd
xhtml-basic-1.0-dtd
xhtml-basic-1.1-dtd
mathml-1.01-dtd
xhtml-mathml-svg-dtd
svg-1.0-dtd
svg-1.1-full-dtd
svg-1.1-basic-dtd
svg-1.1-tiny-dtd

They are all stored in the doc-types block, which is used in the HTML dialect.

To switch the HTML version, place the doctype word before any other code. A <br> tag is used here as an example:

output-html [html-4.01-strict tag [br]]
== {<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd"><br>}
output-html [xhtml-1.0-strict tag [br]]
== {<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><br />}

Dynamic Content

The HTML dialect provides powerful methods for inserting dynamic content and shaping HTML from REBOL data. For this section the output-html word is dropped from examples.

Using do

The do command evaluates a block of regular REBOL code. The last value returned from the block is fed into html-gen and inserted at the place where the do block appears. The value can be either a normal value or an entire dialect block, which lets you provide alternative HTML content in certain circumstances.

Example:

div /now [do [now]]
== <div class="now">22-Jun-2008/04:12:06+2:00</div>

You can use it to provide alternate blocks, say, if your table of blog posts is empty. Example:

div #posts [
  do [
    either empty? posts [[
      "No posts in this blog."
    ]][[
      table rows :posts
    ]]
  ]
]

Using set-word!

The set-word datatype is specifically reserved for reusable blocks of code or single words or values in your dialect code, making your code even smaller. You must refer to the original set-word! as a word!, when you want to recall the content.

hello: [p "Hello World!"]
hello hello

Produces:

<p>Hello World!</p>
<p>Hello World!</p>

The word is stored in an internal block of keys and values in the ctx-html context in the user-words block throughout the lifetime of the context. You can't delete defined words, but you can redefine them. The word is not available globally and defining a global word with the same name will not affect the content of the stored word.

You can redefine the same word throughout the use of the block. Example:

hello: [p "Hello World!"] hello
hello: [p "Goodbye"] hello

Produces:

<p>Hello World!</p>
<p>Goodbye</p>

The do command is also evaluated on each call:

t: [p [do [wait 1 now]]]
t t t

Produces:

<p>10-Aug-2008/0:25:56+2:00</p>
<p>10-Aug-2008/0:25:57+2:00</p>
<p>10-Aug-2008/0:25:58+2:00</p>

Caveats

Each block that you assign to a set-word! must make complete sense to html-gen on its own, so you can't split things like tables into multiple segments. You can use an entire table cell's content, but not separate rows or format blocks.

If you (accidentally) define a word that also qualifies as a tag, such as 'b or 'strong, that word will not be usable. Such an example would merely result in the word being ignored and the appropriate tag would be used instead. Tags have higher priority. Example:

p: [p "Hello"] p

Produces:

<p></p>

You can't delete defined words. They stay in the context throughout its lifetime.

Using get-word!

The use of get-word! types in code will automatically get a word from the global context, or whatever context the dialect code block is bound to at HTML generation time.

a: "my string"
output-html [p [:a]]
== "<p>my string</p>"

get-word! does not mirror set-word!.

If you define a set-word! as mentioned in the section above, this set-word! will not be usable as a get-word!. get-word! only refers to the contexts to which the input block is bound.

Page Composition

You can build a webpage from multiple blocks and then combine them together in a single "super block", which may help you create clearer page compositions. When including a get-word!, it is processed internally with html-gen, and if that word contains some HTML dialect code, it will be processed as that. You can do this even in multiple levels as html-gen is fully recursive.

Example:

header: [div #menu "My header"]
content: [div #body "My body"]
footer: [div #footer "Copyright 2008 Acme Corp"]

output-html [page "My Page" [:header :content :footer]]

Produces:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <title>My Page</title>
  </head>
  <body>
    <div id="menu">My header</div>
    <div id="body">My body</div>
    <div id="footer">Copyright 2008 Acme Corp</div>
  </body>
</html>

Looping

Built-in looping and table generation features can reduce code size considerably. There are various methods:

Loop HTML Code

Basic loops allow you to output the same code N times.

Example:

>> output-html [loop 5 [b 'test]]
== {<b>test</b><b>test</b><b>test</b><b>test</b><b>test</b>}

You can alternate between two pieces of HTML code on odd and even iterations by specifying alternate and an additional block:

Example:

>> output-html [loop 5 [b 'odd] alternate [b 'even]] 
== {<b>odd</b><b>even</b><b>odd</b><b>even</b><b>odd</b>}

Generate Tables

Tables can be generated purely from objects or blocks or they can be generated entirely in code for static layout purposes.

A table consists of some elements:

  1. A format block, which contains the data as it will be printed row by row if the rows description is an external data block.
  2. A rows block, which contains the data as it will be formatted for each table row. Here you can pass a get-word!, representing an external data block or pass data directly. You can also specify a sub-dialect to create a very specific table layout. There can be several different row descriptions.
  3. Optional usage of even, odd, first, last, odd-last and even-last keywords prior to each row description to tell the HTML dialect when to use these rows.

Example with a 1-dimensional block:

output-html [table /table-style rows [1 2 3]]

Produces:

<table class="table-style" cellspacing="0" cellpadding="0">
  <tr>
    <td>1</td>
  </tr>
  <tr>
    <td>2</td>
  </tr>
  <tr>
    <td>3</td>
  </tr>
</table>

A 2-dimensional block produces more substantial output:

output-html [table /table-style rows [[1 "foo" 3][4 5 "boo"]]]

Produces:

<table class="table-style" cellspacing="0" cellpadding="0">
  <tr>
    <td>1</td>
    <td>foo</td>
    <td>3</td>
  </tr>
  <tr>
    <td>4</td>
    <td>5</td>
    <td>boo</td>
  </tr>
</table>

The table also consists of a sub-dialect to describe each row as a collection of HTML rows and cells.

Many more and deeper examples and options are available in the HTML Dialect Command Reference under section Table.

Traverse Blocks

This is the most versatile method, in that you can traverse many different kinds of data structures in series! form and output any type of loopable HTML code.

As a basis, traverse requires 3 elements:

  1. The data, which is usually a block of any type of data.
  2. The using block of words, specifying how to access that data in each iteration.
  3. The output dialect block, which is the code piece to be repeated, applying an entry of data on each loop. Each word from the using block is applied as get-word!.

Example:

data: [[1 2 3][4 5 6]]
fields: [title text date]

>> output-html [
    traverse :data using :fields
      [div /title :title div /text :text div /date :date]
   ]

Produces:

<div class="title">1</div><div class="text">2</div><div class="date">3</div>
<div class="title">4</div><div class="text">5</div><div class="date">6</div>

More and deeper examples are available in the HTML Dialect Command Reference under section Traverse.

Forms

The HTML dialect imposes some intentional limitations on forms to simplify the form system. This serves as a basis for future use with AJAX communication:

  • All HTML dialect forms send via the POST method. This will not change.
  • There are no per-field settings yet, such as maximum field size. This is planned for change later.
  • There is no scheme yet for form validation, neither server- nor client-side. This is planned for change later.

There are several ways to create and manage forms.

The Straight Forward Method

You add form elements and a submit button, and then when you submit the form, the server receives them via POST.

A form is created by stating the form action, name and default input values through an optional object. The form code is enveloped in a block. Each field element describes its associated name as a word! value.

This method is fine if you are creating a form for a completely static HTML page, for example one handled by a PHP server-side solution.

Example:

form submit.rsp [
  div /label "Name" [field name]
  div /label "Address" [field address]
  div [submit "Submit Form"]
]

Produces:

<form action="submit.rsp" method="POST">
  <div class="label">
    Name
    <input type="field" name="name" value="" />
  </div>
  <div class="label">
    Address
    <input type="field" name="address" value="" />
  </div>
  <div>
    <input type="submit" name="Submit Form" />
  </div>
</form>

The names used for each field in the form are the same as those used in the dialect.

If you want default data to be put in the form or expect to revisit the form in a validation process through REBOL and/or Cheyenne, you need a different method.

Using the Form Object

If you use a get-word! in the form specification, you can attach an external object to the form. The object must already exist with the required content. This method is better than the one above if you want default data in the form, or if you want to recreate the form with its existing data in the same REBOL runtime environment that originally generated it.

The job of the HTML dialect finishes when the HTML code is rendered, but you can tie form data to a fixed object whose values are updated when form data is submitted to the server. You must write that update step yourself. The object is stored internally in the ctx-html context as form-object, which is none by default.

The added benefit is that the HTML dialect will auto-refill the form with the values stored in the object when the page needs to be rendered again.

Example, showing a pre-existing form object, that has the same words as used in the form:

form-data: make object! [
  name: "Luke Lakeswimmer"
  address: "Tatooine Rebol Base"
]

And in the form dialect code, we include form-data:

form submit.rsp :form-data [
  div /label "Name" [field name]
  div /label "Address" [field address]
  div [submit "Submit Form"]
]

Produces:

<form action="submit.rsp" method="POST">
  <div class="label">
    Name
    <input type="field" name="name" value="Luke Lakeswimmer" />
  </div>
  <div class="label">
    Address
    <input type="field" name="address" value="Tatooine Rebol Base" />
  </div>
  <div>
    <input type="submit" name="Submit Form" />
  </div>
</form>
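
The dialect does not update the form object when data is submitted, so that step is up to you. A rough sketch in plain REBOL, assuming the submitted data is available as a block of word/value pairs; the posted block and its values are hypothetical stand-ins for whatever your server provides:

```rebol
form-data: make object! [
    name: "Luke Lakeswimmer"
    address: "Tatooine Rebol Base"
]

; Hypothetical submitted data. How POST data actually arrives
; depends on your server; this block only stands in for it.
posted: [name "New Name" address "New Address" unknown "ignored"]

foreach [key value] posted [
    ; Only update words that already exist in the object.
    if in form-data key [set in form-data key value]
]

probe form-data
```

Rendering the form again with :form-data would then refill the fields with the updated values.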

You can also create the form-data object inline in the dialect code or as a block of key/value pairs.

Future

As this is a very early development version of the dialect, a lot of features are missing:

  • A CSS dialect to simplify creation of CSS content.
  • Further automation and simplification of form creation and management.
  • A stylize function, similar to the one in VID, for defining customized tags with one word.