
Wicket TextileLabel

8 Jul 2007

Well, first I must apologize for not updating my blog in some time. Loads of interesting (and time-consuming) things have been happening recently, specifically related to my employment as well as a rather jam-packed holiday week in the US. On a slightly different (but related) note, I resigned from my full-time job and am once again a “programmer for hire” (if you’re interested, you can drop me an email: djspiewak [AT] gmail). Thankfully, all the closing details seem to be in order, so I’ll finally have a little more time to devote to this blog as well as to writings on EclipseZone.

In that vein, I’ve spent a bit more time with Wicket (now Apache Wicket) recently, and I have to confess I’m even more impressed with it than I was a year ago when I first looked at it. The separation of markup and code is a powerful concept that no other framework seems to have achieved to the same level without massive libraries of custom tags. For those of you who aren’t already familiar with Wicket, it’s an Open Source, component-based web framework. You create pages in Wicket by writing a standard HTML file, adding a wicket:id="yadayada" attribute to the dynamic elements, and then adding the corresponding component instances to the Page instance in code. No Java in your HTML, no HTML in your Java.

One of the things I stumbled upon in my latest run at Wicket is the limitations of its MultiLineLabel component. MultiLineLabel lets you display large blocks of text with the characters appropriately escaped and all line breaks converted to proper <p></p> blocks. It’s not a complicated component, and this becomes glaringly obvious once you need something a bit more substantial.

The site I’m using to experiment has a need for large blocks of (preferably) formatted text. At first, I figured I’d just throw a MultiLineLabel up and call it done. However, the need for formatting seemed a bit more pressing, so I began to look at alternatives. And it occurred to me that perhaps the simplest way to enter formatted text is to use Textile. Unfortunately, this means I need some way to render Textile into HTML within Wicket.

After some Googling, I was able to positively ascertain that there are no Wicket components which provide Textile rendering. Not being one to give up there, I decided to roll my own. After all, one of Wicket’s major selling points is that it makes custom components dead easy, right?

Well, it seems the hype is justified in this area too, although I must admit the documentation is sorely lacking. I ended up cracking open the source for MultiLineLabel (which was surprisingly readable) and discovering that the key is to override the onComponentTagBody method. With a little more Googling, I found PLextile, which is the most complete Java Textile rendering library I came across. A few minutes of quick hacking later, I came up with this:

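protected void onComponentTagBody(MarkupStream markupStream, ComponentTag openTag) {
    // The model holds the raw Textile source; treat a missing value as empty text.
    String text = (String) getModel().getObject();
    if (text == null) {
        text = "";
    }

    // Render the Textile to HTML with PLextile, fix up the paragraphs,
    // and write the result as this component's body.
    replaceComponentTagBody(markupStream, openTag, postTextilize(new TextParser().parseTextile(text, true)));
}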

PLextile handles all the heavy lifting here, and parses out just about everything just fine. Its main stumbling point is how it handles line breaks. According to Why’s comprehensive Textile reference, line breaks should be handled by wrapping the sections into <p></p> blocks, whereas PLextile was inserting <br/> tags. Quite frustrating, let me tell you.
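
To make the expected behavior concrete, here is a minimal sketch of the paragraph rule in plain Java. It is not how RedCloth or PLextile implement it: the class and method names are made up for illustration, it ignores every other piece of Textile syntax, and it does no HTML escaping. It assumes that blank lines separate paragraphs and that a single newline inside a paragraph becomes a line break.

```java
public class ParagraphSketch {

    // Hypothetical helper: wraps blank-line-separated blocks in <p></p>.
    public static String toParagraphs(String text) {
        // Normalize Windows (\r\n) and lone \r line endings to \n first.
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');

        StringBuilder html = new StringBuilder();
        // A newline, optional spaces or tabs, then more newlines ends a block.
        for (String block : normalized.split("\\n[ \\t]*\\n+")) {
            String trimmed = block.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // A single newline inside a block becomes a line break.
            html.append("<p>")
                .append(trimmed.replace("\n", "<br />"))
                .append("</p>");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            toParagraphs("first line\r\nsame paragraph\r\n\r\nsecond paragraph"));
    }
}
```

Normalizing the line endings before splitting means the same input yields the same paragraphs whether it came from a Windows browser, a database, or a test string.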

I considered using RedCloth (Why-the-Lucky-Stiff’s Ruby Textile renderer, wrapped by Rails for the famous textilize method) through JRuby and Java 6 embedded scripting, but it seemed awfully heavy to fire up an entire JRuby interpreter instance just to parse some text, so I decided against it. Instead, I wrote a post-processor for PLextile (hence the postTextilize method in the example above). This method is actually where most of the component’s code lives:

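private String postTextilize(String textile) {
    // Wrap the whole rendered body in a single paragraph to start with.
    textile = "<p>" + textile + "</p>";
    // String.replace matches literally (no regex), so "\r" is a real carriage return.
    // A break, a carriage return, and another break mark a paragraph boundary.
    textile = textile.replace("<br />\r<br />", "</p><p>");
    // Drop any remaining single line breaks.
    textile = textile.replace("<br />", "");

    return textile;
}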
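
To see what the post-processor does in isolation, here is a standalone sketch of the same steps. The sample input is an assumption about the shape of PLextile's output (a break, a carriage return, then another break between paragraphs), not captured output, and the class name is invented for the demo. Because String.replace treats its arguments literally rather than as a regex, this version matches an actual carriage return character.

```java
public class PostTextilizeDemo {

    // Same steps as the post-processor, as a static method so it runs on its own.
    public static String postTextilize(String html) {
        html = "<p>" + html + "</p>";
        // Literal match: "<br />", a carriage return, "<br />".
        html = html.replace("<br />\r<br />", "</p><p>");
        // Any remaining single breaks are removed outright.
        html = html.replace("<br />", "");
        return html;
    }

    public static void main(String[] args) {
        // Assumed input shape, not real PLextile output.
        String sample = "First paragraph.<br />\r<br />Second paragraph.";
        System.out.println(postTextilize(sample));
    }
}
```

Note that a single break is deleted rather than turned into a space, so two lines of the same paragraph are joined with nothing between them.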

Anyway, wrap it all up into a WebComponent subclass and it’s ready to use in a page. Swap TextileLabel for your former MultiLineLabel usages, and you’re ready to go!

You can download the finished component here.

Comments

  1. Hey, Ben here. I’m the guy behind PLextile — I happened to be Googling it. I was reading the problems you’re having… Two newlines should give you a new paragraph (and you can use other block level tags to change that to a block quote, block code, header, whatever)… Or is my program doing something annoying that I’m missing… Keep me posted, I’d like to ensure that people don’t get too frustrated.

    Cheers, and check out my site ;)

    Ben.

    Ben Monday, July 9, 2007 at 1:21 am
  2. I think it’s doing something weird that you’re missing. :-)

    Two newlines give me as many <br /> tags, no paragraph tags anywhere in sight (or site, as it were). One newline actually gives me a single <br /> tag, which is really weird.

    Incidentally, I updated the article to link to PLextile directly, since I totally forgot to do that the first go ’round. :-)

    Daniel Spiewak Monday, July 9, 2007 at 1:41 am
  3. I just installed PLextile on one of my Windows computers to see if it was an OS-related thing, and it seemed to work fine, so it’s not an OS-related bug.

    I’m wondering if:

    String text = (String) getModel().getObject();

    Has the newlines escaped in a way that I might not have predicted (off the top of my head, is there a chance it’s using ‘\r’ instead of ‘\n’ or doing any other exciting things)? Mind taking a peek at the value of ‘text’ before the parse is applied? Keep me posted,

    bencoe [at] gmail.com

    Ben Monday, July 9, 2007 at 3:38 am
  4. you could play with:
    Component.setEscapeModelStrings(final boolean escapeMarkup)

    wicket escapes by default

    Johan Compagner Monday, July 9, 2007 at 4:29 am
  5. @Johan

    But I don’t *want* Wicket to escape the string, I want PLextile to handle that for me. If Wicket escapes the string, it’ll screw up PLextile’s parsing.

    @Ben

    You know, come to think of it there was something interesting going on. I’m fairly sure that the linebreaks in the String I’m passing PLextile are just [\n\r]{2} (the string is created in a text input field, stored in a MySQL database and retrieved using the model, Wicket doesn’t process it at all). However, I noticed that PLextile’s result used \r linebreaks, rather than \n or [\n\r]{2}. You’ll notice I had to account for that in my post-processing code. I thought that was a bit odd…

    Could that be the issue then? If [\n\r]{2} linebreaks are a problem, I can do a little pre-processing and normalize everything to \n, but shouldn’t that be something PLextile tries to do automatically?

    Daniel Spiewak Monday, July 9, 2007 at 11:51 am
  6. Oh, also to make sure we are on the same page… Do the strings actually literally include “\r” and “\n” — Perhaps this is what Johan means by them being automatically escaped… if so.

    text=text.replaceAll("\\r",""+(char)'\r');
    text=text.replaceAll("\\n",""+(char)'\n');

    It would be great if the problem was this simple, ;)

    Ben.

    Ben Monday, July 9, 2007 at 12:19 pm
  7. lol :-)

    No, they don’t include the literal strings “\r” or “\n”. I’ve revisited my supporting code a bit, it seems the only place Wicket touches the values is in the data entry, and I can verify (through a SELECT on the db) that the text is correct and unmodified. The model which wraps the entity object (which SELECTs from the DB) is a custom model implementation and does no processing. replaceComponentTagBody doesn’t process the text either, it just spits it to the markup stream.

    The only thing I can think of right now which could be weird is the linebreaks in the input, which could very well be [\n\r]{2} rather than \n. Is this a problem for PLextile?

    Daniel Spiewak Monday, July 9, 2007 at 12:23 pm
  8. PLextile looks for the occurrence of two newline characters in a row “\n\n” to represent a new block… It first strips all ‘\r’ characters though. so “\n\r” shouldn’t be a problem — I myself use a Macbook which uses both characters (if I recall doesn’t windows just use ‘\n’?)… The only thing I can think of that could be a problem is:

    1) The characters are escaped in a weird way, like my previous post suggested.
    2) There is some sort of whitespace between two ‘\n’s perhaps “\n \n”.

    Try parsing this string as a test:

    String test="h1{color: red}. Hello World\n\np>.I’m a test paragraph.\nstill same paragraph.\n\nh2. Second level header.";

    I just tried it in the newest build of the parser and get:

    Hello World
    I’m a test paragraph.still same paragraph.
    Second level header.

    Ben.

    Ben Monday, July 9, 2007 at 12:28 pm
  9. Interesting, to keep you posted, my parse ends up with the proper HTML with H1, H2, P, etc. But when I posted it it stripped all the HTML — this might just be how your comments work, but could a similar thing be happening with the other data?

    Good luck,

    Ben.

    Ben Monday, July 9, 2007 at 12:30 pm
  10. Hmm, it does spit out the proper output on my end as well. (wrapped in paragraph tags) It’s sounding more and more like whitespace stuff, but that would be really odd.

    Windows uses \n\r as the linebreak character. I was under the impression MacOS X followed the *nix standard of \n. It was MacOS 9 and lower which used the solo \r char.

    Daniel Spiewak Monday, July 9, 2007 at 12:48 pm
  11. Any more headway on that whitespace issue?

    I thought I’d leave a message and mention that a slightly newer version of PLextile is up on Sourceforge. I’ve been working on a website for an IEEE competition that uses PLextile, and have been finding/fixing quite a few bugs with it.

    Ben.

    Ben Thursday, July 12, 2007 at 5:46 pm
  12. No real headway, unfortunately. My current working theory is that PLextile is handling the \n\r whitespaces wrong (since I can step through the text myself and see that those are the characters coming from the HTML element).

    I’ll try that latest version and see if that solves it. :-)

    Daniel Spiewak Thursday, July 12, 2007 at 8:07 pm
  13. Incidentally, have you considered easing our adoption pain considerably by adding an Ant build file? Or at least a precompiled JAR file. I know it doesn’t take all that much to assemble the JAR ourselves, but it’s still a headache.

    Daniel Spiewak Thursday, July 12, 2007 at 8:10 pm
  14. With the latest version, I still have the same problems.

    Daniel Spiewak Thursday, July 12, 2007 at 8:15 pm
  15. Hi Daniel,

    The link to your TextileLabel is dead. Could you post it again, please?

    ReinoutS Wednesday, September 30, 2009 at 3:46 am
  16. Unfortunately, I’m not sure I even have the source anymore. I’ll take a look to see if I can find it, but I can’t make any promises. Fortunately, it should be pretty easy to recreate the component from the info given in this article.

    Daniel Spiewak Thursday, October 1, 2009 at 2:15 pm
  17. Great post. I spent half an hour looking for a solution to my problem and I found it here. Great :)

    cheers from Poland!:)

    Dani from Poland Thursday, April 1, 2010 at 3:29 am
