
Working with Scala’s XML Support

24 May 2010

XML is probably one of Scala’s most controversial language features (right behind unrestricted operator overloading). On the one hand, it’s very nice to be able to simply embed XML fragments and XPath-like expressions within your Scala source code. At least, it’s certainly a lot nicer than the string-literal approach that is required in many other languages. However, XML literals also complicate the syntax tremendously and pose endless difficulties for incremental syntax-aware editors such as IDEs.

Irrespective of the controversy though, XML literals are part of the language and they are here to stay. Martin Odersky has mentioned on multiple occasions that he half-regrets the inclusion of XML literal support, but he can’t really do anything about it now that the language has taken hold and the syntax has solidified. So, we may as well make the best of it…

Unfortunately, Scala’s XML library is very…weird. Especially in Scala 2.7. The class hierarchy is unintuitive, and there are odd pitfalls and correctness dangers just waiting to entrap the unwary. That fact, coupled with the lack of appropriate documentation in the language specification, leads to a very steep learning curve for new users. This is quite unfortunate, because a solid understanding of Scala’s XML support is vital to many applications of the language, most notably the Lift web framework.

I can’t personally do anything about the strangeness in the XML library. Like the literal syntax itself, it’s too late to make many fundamental changes to the way XML works in Scala. However, I can try to make it easier for beginners to get up and running with Scala’s XML support.

The Hierarchy

Before we get to literals and queries, it’s important to have some idea of the shape of Scala’s XML library and how its class hierarchy works. I found (and find) this to be the most unintuitive part of the entire ordeal.

[Figure: class hierarchy diagram of scala.xml (NodeSeq, Node, Elem, Group, SpecialNode, Atom, Text)]

There are actually more classes than just this (such as Document, which extends NodeSeq, and Unparsed, which extends Atom), but you get the general idea. The ones I have shown are the classes which you are most likely to use on a regular basis.

Starting from the top, NodeSeq is probably the most significant class in the entire API. The most commonly used methods in the library are defined in the NodeSeq class, and most third-party methods which work with XML usually work at the level of NodeSeq. More specifically, NodeSeq defines the \\ and \ methods, which are used for XPath selection, as well as the text method, which is used to recursively extract all text within a particular set of nodes. If you’re familiar with libraries like Nokogiri, you should be right at home with the functionality of these methods.
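To make those three methods a little more concrete, here is a minimal sketch. It assumes the scala-xml library is on the classpath (in newer Scala versions it ships separately from the standard library), and the `page` document is something I made up purely for illustration:

```scala
import scala.xml._

// Hypothetical sample document; any nested structure would do
val page = <div><p>Hello, <b>World</b>!</p><p> Bye.</p></div>

// text concatenates every text fragment beneath the node, at any depth
val allText: String = page.text

// \\ searches the whole tree, while \ only looks at direct children
val boldDeep: NodeSeq = page \\ "b"
val boldShallow: NodeSeq = page \ "b"
```

Here `boldDeep` holds the single nested b element, while `boldShallow` comes back empty because b is not a direct child of div.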

One particularly useful aspect of Scala’s XML library is the fact that NodeSeq extends Seq[Node]. This means that you can use standard Scala collections operations to fiddle with XML (map, flatMap, etc). Unfortunately, more often than not, these methods will return something of type Seq[_], rather than choosing the more specific NodeSeq when possible. This is something which could have been solved in Scala 2.8, but has not been as of the latest nightly. Until this design flaw is rectified, the only recourse is to use the NodeSeq.fromSeq utility method to explicitly convert anything of type Seq[Node] back into the more specific NodeSeq as necessary:

val nodes: Seq[Node] = ...
val ns: NodeSeq = NodeSeq fromSeq nodes
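A slightly fuller sketch of that round trip might look like the following. It assumes scala-xml is available, and the list markup and value names are hypothetical. I go through `toList` on purpose so that the intermediate result is a plain Seq[Node] regardless of which library version you happen to be on:

```scala
import scala.xml._

// Hypothetical input: the li children of a small list
val items: NodeSeq = <ul><li>a</li><li>b</li></ul> \ "li"

// A collections pipeline leaves us holding a plain Seq[Node]...
val upper: Seq[Node] = items.toList.map(n => <li>{ n.text.toUpperCase }</li>)

// ...so we wrap it back up before using XML selectors on it again
val back: NodeSeq = NodeSeq.fromSeq(upper)
val texts: List[String] = (back \\ "li").toList.map(_.text)
```

Note the use of \\ in the last line: the li elements are the members of `back`, not their children, so a plain \ would find nothing.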

Immediately deriving from NodeSeq is another landmark class in the Scala API, Node. At first glance, this may seem just a bit weird. After all, Node inherits from NodeSeq which in turn inherits from Seq[Node]. Thus, a single Node can also be viewed as a NodeSeq of length one, containing exactly itself. Yeah, that one took me a while…
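If that sounds too strange to be true, a quick sketch can confirm it (assuming scala-xml is on the classpath; the value names are mine):

```scala
import scala.xml._

val single: Node = <foo/>

// No conversion needed: a Node already is a NodeSeq
val asSeq: NodeSeq = single

// The sequence view has exactly one member, and that member is the node itself
val size: Int = asSeq.length
val sameNode: Boolean = asSeq.head eq single
```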

Everything in the Scala XML library is a NodeSeq, and almost everything is a Node. If you remember this fact, then you understand the entire API. The Elem class represents a single XML element with associated attributes and a child NodeSeq (which may of course be empty). The Group class is a bit of a hack and should never be used directly (use NodeSeq.fromSeq instead).

Of the SpecialNode hierarchy, only Atom deserves any special attention, and of its children, Text is really the most significant. Text is simply the way in which the Scala XML library represents text fragments within XML. Clearly, XML elements can have textual content, but since the child(ren) of an Elem have to be Node(s), we need some way of wrapping up a text String as a Node. This is where Text comes in.

It is worth noting that the Atom class actually takes a single type parameter. Text inherits from Atom[String]. I find this aspect of the API just a bit odd, since there aren’t any subclasses of Atom which inherit from anything other than Atom[String], but that’s just the way it is.
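As a small illustration of where Text shows up, here is a sketch that assumes scala-xml is available; the `para` element and value names are hypothetical:

```scala
import scala.xml._

val para = <p>hello</p>

// The literal text inside the element is stored as a child Node...
val firstChild: Node = para.child.head

// ...and that child is a Text, which is itself an Atom
val isText: Boolean = firstChild.isInstanceOf[Text]
val isAtom: Boolean = firstChild.isInstanceOf[Atom[_]]

// Wrapping a String by hand gives us a Node we can place in a tree
val wrapped: Node = Text("hello")
```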

Literals

Now that we’ve got the fundamental class hierarchy out of the way, it’s time to look at the most visible aspect of Scala’s XML support: XML literals. Most Scala web frameworks tend to make heavy use of XML literals, which can be a bit annoying due to the difficulties they cause most editors (I’m still trying to get the jEdit support nailed down). Even still, XML literals are a very useful part of the language and almost essential if you’re going to be working with XML content.

Fortunately, Scala’s XML syntax is as intuitive to write as it is difficult to parse:

val ns = <span id="foo"><strong>Hello,</strong> World!</span>
println(ns.toString)      // prints the raw XML

The thing to remember is that any time text appears after the < operator without any whitespace, Scala’s parser will jump into “XML mode”. Thus, the following code is invalid, even though it seems like it should work:

val foo = new {
  def <(a: Any) = this
}
 
foo <foo          // error!

<rant>This is yet another example of Scala’s compiler behaving in strange and unintuitive ways due to arbitrary resolution of ambiguity in the parser. The correct way to handle this would be for the parser to accept the local ambiguity (XML literal vs operator and value reference) and defer the resolution until a later point. In this case, the final parse tree would be unambiguous (there is no way this could correctly parse as an XML fragment), so there’s no danger of complicating later phases like the type checker. Unfortunately, Scala’s parser (as it stands) is not powerful enough to handle this sort of functionality. *sigh*</rant>

Scala’s XML literal syntax is actually sugar for a series of Elem and Text instantiations. Specifically, Scala will parse our earlier example as the following:

val ns = Elem(null, "span", new UnprefixedAttribute("id", Text("foo"), Null), TopScope, 
  Elem(null, "strong", Null, TopScope, Text("Hello,")), Text(" World!"))
 
println(ns.toString)

You will notice that the attribute value is actually wrapped in a Text node. This is necessary because attributes can be returned from XPath selectors, which always return values of type NodeSeq. Thus, the content of an attribute must be of type Node. Unfortunately, this opens up a rather obvious hole in the type safety of the API: the compiler will allow you to store any Node within an attribute, including something of type Elem. In fact, you won’t even get an exception at runtime! The following code compiles and runs just fine:

new UnprefixedAttribute("id", <foo/>, Null)

The good news is that you will almost never use UnprefixedAttribute directly, mostly because the API is so clumsy. Most of the time, you will spend your time either consuming pre-baked XML coming in from some external source, or synthesizing it yourself using literals.

Of course, not all XML is fully known at compile time. In fact, most often XML is just a structured wrapper around some data which is produced dynamically. To that end, Scala provides a convenient syntax for XML interpolation. This makes it possible to construct XML dynamically based on variables and expressions. For example, we might want to make the id attribute of the span element dynamic based on some method parameter:

def makeXML(id: String) = <span id={ id }><strong>Hello,</strong> World!</span>
 
makeXML("foo")        // => <span id="foo">...</span>

The interpolation syntax is actually fairly generous about what you are allowed to embed. By default, any values within the { ... } markers will first be converted to a String (using its toString method) and then wrapped in a Text before embedding in the XML. However, if the expression within the braces is already of type NodeSeq, the interpolation will simply embed that value without any conversion. For example:

val ns1 = <foo/>
val ns2 = <bar>{ ns1 }</bar>       // => <bar><foo/></bar>

You can even embed something of type Seq[Node] and the interpolation will “do the right thing”, flattening the sequence into an XML fragment which takes the place of the interpolated segment:

val xs = List(<foo/>, <bar/>)
val ns = <baz>{ xs }</baz>          // => <baz><foo/><bar/></baz>
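Putting these interpolation rules together, a typical "render a list from data" helper might look roughly like this. It is only a sketch: it assumes scala-xml is on the classpath, and `userList` and the markup it produces are hypothetical names of my own:

```scala
import scala.xml._

// Hypothetical helper: attribute, text and sequence interpolation all in one literal
def userList(names: List[String]): Elem =
  <ul class="users">{ names.map(n => <li>{ n }</li>) }</ul>

val rendered: Elem = userList(List("Ann", "Bob"))

// The embedded List of elements was flattened into direct children of ul
val items: List[String] = (rendered \ "li").toList.map(_.text)
val cssClass: String = (rendered \ "@class").text
```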

These auto-magical interpolation features are incredibly useful when assembling XML from multiple sources. Their only downside is the fact that the 2.7 version of the Scala IDE for Eclipse really struggles with XML literals, and with interpolated expressions in particular. My recommendation: if you need to work with XML literals, either avoid Eclipse entirely or be careful to wrap all XML literals in parentheses (like this: (<foo><bar/></foo>)). Note that the 2.8 version of the Scala IDE for Eclipse doesn’t impose this requirement.

XPath

Of course, creating XML is really only half the story. In fact, it’s actually much less than that. In practice, most XML-aware applications spend the majority of their time processing XML, not synthesizing it. Fortunately, the Scala XML API provides some very nice functionality in this department.

For starters, it is possible to perform XPath-like queries. I say “XPath-like” because it’s really not quite as nice as XPath, nor as full-featured. Sometimes it takes several chained queries to perform the same action as a single, compound XPath query. However, despite its shortcomings, Scala’s XPath support is still dramatically superior to manual DOM walking or SAX handling.

The most fundamental XML query operator is \ (bear in mind that all XML operators are defined on NodeSeq). This operator applies a given String pattern to the direct children of the target NodeSeq; the target itself is never matched. For example:

val ns = <foo><bar><baz/>Text</bar><bin/></foo>
 
ns \ "bar"              // => <bar><baz/>Text</bar>
ns \ "bar" \ "baz"      // => <baz/>
ns \ "foo"              // => empty NodeSeq (foo is the target, not a child)

As you can see, the most generic pattern which can be fed into the \ operator is simply the name of the element. All XML operators return NodeSeq, and so it’s very easy and natural to chain multiple operators together to perform chained queries.
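One detail that is easy to trip over is that \ looks only one level down and never matches the node you call it on. A minimal sketch, assuming scala-xml is on the classpath (the value names are made up):

```scala
import scala.xml._

val doc = <foo><bar><baz/>Text</bar><bin/></foo>

// The root is the target, not one of its own children, so this is empty
val self: NodeSeq = doc \ "foo"

// Direct children match by name
val bars: NodeSeq = doc \ "bar"

// Chaining walks down one level per operator
val bazs: NodeSeq = doc \ "bar" \ "baz"

// baz is a grandchild of the root, so a single \ does not reach it
val tooDeep: NodeSeq = doc \ "baz"
```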

However, we don’t always want to chain scores of \ operators together to get at a single deeply-nested element. In this case, we might be better served by the \\ operator:

val ns = <foo><bar><baz/>Text</bar><bin/></foo>
 
ns \\ "bar"          // => <bar><baz/>Text</bar>
ns \\ "baz"          // => <baz/>

Essentially, \\ behaves exactly the same as \ except that it recurses into the node structure. It will return all possible matches to a particular pattern within a given NodeSeq. Thus, if a pattern matches a containing element as well as one of its children, both will be returned:

val ns = <foo><foo/></foo>
 
ns \\ "foo"          // => <foo><foo/></foo><foo/>

The NodeSeq returned from the ns \\ "foo" query above actually has two elements in it: <foo><foo/></foo> as well as <foo/>. This sort of recursive searching is very useful for drilling down into deeply nested structures, but its unconstrained nature makes it somewhat dangerous if you aren’t absolutely sure of the depth of your tree. Just as a tip, I generally confine myself to \ unless I know that the node name in question is truly unique across the entire tree.
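The difference in reach between the two operators is easy to check by counting matches. This is just a sketch under the assumption that scala-xml is available; `nested` is a hypothetical name:

```scala
import scala.xml._

val nested = <foo><foo/></foo>

// \\ considers the target itself and everything beneath it
val everywhere: NodeSeq = nested \\ "foo"

// \ considers only the direct children
val childrenOnly: NodeSeq = nested \ "foo"
```

With this input, `everywhere` has two members (the outer and the inner foo) while `childrenOnly` has just the inner one.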

In addition to simply selecting elements, Scala also makes it possible to fetch attribute values using its XML selectors. This is done by prefixing the name of the attribute with ‘@’ in the selector pattern:

val ns = <foo id="bar"/>
 
ns \ "@id"        // => Text(bar)

One minor gotcha in this department: the \ always returns something of type NodeSeq. Thus, the results of querying an attribute value are actually of type Text. If you want to get a String out of an attribute (and most of us do), you will need to use the text method:

(ns \ "@id").text         // => "bar"

Take care though that your selector is only returning a single Text node, otherwise invoking the text method will concatenate the results together. For example:

val ns = <foo><bar id="1"/><bar id="2"/></foo>
 
(ns \\ "@id").text          // => "12"
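If what you actually want is the individual values, one option is to convert each matched node separately rather than calling text on the whole result. A minimal sketch, assuming scala-xml is on the classpath (the names are mine):

```scala
import scala.xml._

val shelf = <foo><bar id="1"/><bar id="2"/></foo>

// text on the whole result runs all of the values together
val joined: String = (shelf \\ "@id").text

// Converting node by node keeps the values apart
val separate: List[String] = (shelf \\ "@id").toList.map(_.text)
```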

Unlike XPath, Scala does not allow you to query for specific attribute values (e.g. "@id=1" or similar). In order to achieve this functionality, you would need to first query for all id values and then find the one you want:

ns \\ "@id" find { _.text == "1" }        // => Some("1")
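More often than not, I want the element that carries a given attribute value rather than the value itself. One way to sketch that is a small helper which selects by name first and then filters on the attribute. This assumes scala-xml is available, and `byId` is a hypothetical helper, not part of the library:

```scala
import scala.xml._

val catalog = <foo><bar id="1">one</bar><bar id="2">two</bar></foo>

// Hypothetical helper: first element with the given label whose id attribute equals id
def byId(root: Node, label: String, id: String): Option[Node] =
  (root \\ label).toList.find(n => (n \ "@id").text == id)

val hit: Option[String] = byId(catalog, "bar", "2").map(_.text)
val miss: Option[Node] = byId(catalog, "bar", "9")
```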

Also unlike XPath, Scala does not allow you to query for attributes associated with a particular element name in a single pattern. Thus, if you want to find only the id attributes from bar elements, you will need to perform two chained selections:

ns \\ "bar" \ "@id"

Oh, and one fun added tidbit, Scala’s XML selectors also define a wildcard character, underscore (_) of course, which can be used to substitute for any element name. However, this wildcard cannot be used in attribute patterns, nor can it be mixed into a partial name pattern (e.g. ns \ "b_" will not work). Really, the wildcard is useful in conjunction with a purely-\ pattern when attempting to “skip” a level in the tree without filtering for a particular element name.
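Here is a minimal sketch of the wildcard being used to skip a level, assuming scala-xml is on the classpath (the `tree` document and value names are hypothetical):

```scala
import scala.xml._

val tree = <foo><bar><baz n="1"/></bar><bin><baz n="2"/></bin></foo>

// "_" selects every child element, whatever its name
val children: NodeSeq = tree \ "_"
val labels: List[String] = children.toList.map(_.label)

// Skip the middle level without caring whether it is bar or bin
val grandBaz: NodeSeq = tree \ "_" \ "baz"
val numbers: List[String] = grandBaz.toList.map(n => (n \ "@n").text)
```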

Despite all of these shortcomings, Scala’s almost-XPath selectors are still very useful. With a little bit of practice, they can be an extremely effective way of getting at XML data at arbitrary tree depths.

Pattern Matching

What nifty Scala feature would be complete without some form of pattern matching? We can match on String literals, Int literals, and List literals; why not XML?

<foo/> match {           // prints "foo"
  case <foo/> => println("foo")
  case <bar/> => println("bar")
}

As we would expect, this code evaluates and prints foo to standard out. Unfortunately, things are not all sunshine and roses. In fact, pattern matching is where Scala’s XML support gets decidedly weird. Consider:

<foo>bar</foo> match {   // throws a MatchError!
  case <foo/> => println("foo")
  case <bar/> => println("bar")
}

The problem is that when we define the pattern, <foo/>, we’re actually telling the pattern matcher to match on exactly an empty Elem with label, foo. Of course, we can fix this by adding the appropriate content to our pattern:

<foo>bar</foo> match {   // prints "foo"
  case <foo>bar</foo> => println("foo")
  case <bar>bar</bar> => println("bar")
}

Ok, that’s a little better, but we rarely know exactly what the contents of a particular node are going to be. In fact, the whole reason we’re pattern matching on this stuff is to extract data we don’t already have, so maybe a more useful case would be matching on the foo element and printing out its contents:

<foo>mystery</foo> match {   // prints "foo: mystery"
  case <foo>{ txt }</foo> => println("foo: " + txt)
  case <bar>{ txt }</bar> => println("bar: " + txt)
}

Ok, that worked, and it used our familiar interpolation syntax. Let’s try something fancier. What if we have text and an element inside our Elem?

<foo>mystery<bar/></foo> match {   // throws a MatchError!
  case <foo>{ txt }</foo> => println("foo: " + txt)
  case <bar>{ txt }</bar> => println("bar: " + txt)
}

Like I said, decidedly weird. The problem is that the txt pattern is looking for one Node and one Node only. The Elem we’re feeding into this pattern has two child Node(s) (a Text and an Elem), so it doesn’t match any of the patterns and throws an error.

The solution is to remember the magic of Scala’s @ symbol within patterns:

<foo>mystery<bar/></foo> match {   // prints "foo: ArrayBuffer(mystery,<bar></bar>)"
  case <foo>{ ns @ _* }</foo> => println("foo: " + ns)
  case <bar>{ ns @ _* }</bar> => println("bar: " + ns)
}

Closer, but still not right. If we were to examine the types here, we would see that ns is actually not a NodeSeq, but a Seq[Node]. This means that even if we weren’t naïvely printing out our match results, we would still have problems attempting to use XML selectors or other NodeSeq-like operations on ns.

To get around this problem, we have to explicitly wrap our results in a NodeSeq using the utility method mentioned earlier:

<foo>mystery<bar/></foo> match {   // prints "foo: mystery<bar></bar>"
  case <foo>{ ns @ _* }</foo> => println("foo: " + NodeSeq.fromSeq(ns))
  case <bar>{ ns @ _* }</bar> => println("bar: " + NodeSeq.fromSeq(ns))
}
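In real code I would usually tuck this into a small function with a fallback case, so that an unexpected element doesn't blow up with a MatchError. A sketch, assuming scala-xml is on the classpath; `contents` is a hypothetical helper name:

```scala
import scala.xml._

// Hypothetical helper: the children of a foo element as a proper NodeSeq
def contents(n: Node): NodeSeq = n match {
  case <foo>{ kids @ _* }</foo> => NodeSeq.fromSeq(kids) // kids is only a Seq[Node]
  case _                        => NodeSeq.Empty         // anything else: nothing to extract
}

val found: NodeSeq = contents(<foo>mystery<bar/></foo>)

// Because found is a NodeSeq, the XML selectors work on it again
val bars: NodeSeq = found \\ "bar"
val nothing: NodeSeq = contents(<qux/>)
```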

Success at last! Now let’s try some attributes. To make things easier, we’ll pattern match on static values rather than trying to actually extract data:

<foo id="bar"/> match {
  case <foo id="bar"/> => println("bar")      // does not compile!
  case <foo id="baz"/> => println("baz")
}

As the comment says, this snippet doesn’t compile. Why? Because Scala doesn’t support XML patterns with attributes. This is a horrible restriction and one that I run up against almost daily. Even from a strictly philosophical sense, pattern matching should be symmetric with the literal syntax (just like List and the :: operator). We’ve already seen one instance of asymmetry in XML pattern matching (child extraction), but this one is far worse.

The only way to pattern match in an attribute-aware sense is to use pattern guards to explicitly query for the attribute in question. This leads to vastly more obfuscated patterns like the one shown below:

<foo id="bar"/> match {       // prints "bar"
  case n @ <foo/> if (n \ "@id").text == "bar" => println("bar")
  case n @ <foo/> if (n \ "@id").text == "baz" => println("baz")
}
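The same guard technique can be sketched as a function that returns a value and includes fallback cases, which also makes the order of the cases matter in a visible way. This assumes scala-xml is available, and `describe` and its result strings are invented for illustration:

```scala
import scala.xml._

// Hypothetical classifier: branches on the id attribute using pattern guards
def describe(n: Node): String = n match {
  case e @ <foo/> if (e \ "@id").text == "bar" => "foo#bar"
  case e @ <foo/> if (e \ "@id").text == "baz" => "foo#baz"
  case <foo/>                                  => "foo (other id or none)"
  case _                                       => "unknown"
}

val first: String = describe(<foo id="bar"/>)
val second: String = describe(<foo id="baz"/>)
val other: String = describe(<foo id="qux"/>)
val nonEmpty: String = describe(<foo>text</foo>)
```

Note that the last input falls through to "unknown": the <foo/> pattern ignores attributes, but it still insists on an element with no children.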

This situation is also somewhat confusing when attempting to read code which uses pattern matching and branches on attributes. I’m constantly tripping over this when I look back at even my own code, mostly because it looks for all the world like we’re matching on a foo element with no attributes! Very frustrating.

Oh, and one final added goodie: namespaces. Pattern matching on an unqualified element (e.g. <foo/>) will match not only exactly that element name, but also any namespaced permutations thereof:

<w:gadget/> match {       // prints "gadget"
  case <gadget/> => println("gadget")
}

If you want to match a specific namespace, you need to include it in the pattern:

<w:gadget/> match {       // prints "w:gadget"
  case <m:gadget/> => println("m:gadget")
  case <w:gadget/> => println("w:gadget")
}

In practice, this is actually fairly useful, but it’s still another head-scratcher in the Scala XML design. I know I struggled with this as a beginner, and I can’t imagine it’s that much easier for anyone else.

Concurrency Pitfalls

One thing we (at Novell) learned the hard way is that Scala’s XML library is not thread-safe. Yes, XML literals are immutable, but this alone is not sufficient. Even though the API is immutable (doesn’t provide a way to change an XML literal in-place), the underlying data structures are not. Observant readers will have caught this fact from our pattern matching example earlier, when we mistakenly printed “ArrayBuffer(mystery,<bar></bar>)”.

ArrayBuffer is a little like Scala’s answer to Java’s ArrayList. It’s pretty much the de facto mutable Seq implementation. Under the surface, it’s using a dynamically resized array to store its data, providing constant-time read and amortized constant-time append. Unfortunately, like all array-based data structures, ArrayBuffer suffers from volatility issues. Unsynchronized use across multiple threads involving mutation (even copy mutation like the ++ method) can result in undefined behavior.

The good news is that this problem is fixed in Scala 2.8. The bad news is that a lot of people are still stuck on 2.7. For now, the only solution is to ensure that you never access a single XML value concurrently. This either requires locking or extra data copying to ensure that no two threads have the same copy of a particular NodeSeq. Needless to say, neither solution is ideal.

Conclusion

Scala’s XML support is flaky, inconsistent and arguably a bad idea in the first place. However, the fact that it’s already part of the language means that it’s a little late to bring up inherent design flaws. Instead, we should focus on all that’s good about the library, like the convenience of a very straightforward literal syntax and the declarative nature of almost-XPath selectors. I may not like everything about Scala’s XML support — for that matter, I may not like most of Scala’s XML support — but I can appreciate the benefits to XML-driven applications and libraries such as Lift. Hopefully, this brief guide will help you avoid some of the pitfalls and reap the rewards of XML in Scala with a minimum of casualties.

Comments

  1. Dude your blog’s my fave read :)

    I haven’t finished this article yet but are you aware of the inclusion of E4X (ECMA-357) in AS3 (formerly ES4)? I haven’t run into any compiler parsing ambiguities. Adobe’s official Eclipse plugin stumbles through it but IntelliJ does such a good job you can validate its AST in PSI View. E4X even supports implicit filter predicates (not unlike XPath) not mirrored anywhere else in AS3.

    That’s not to say AS3 is a great language (hardly) or E4X holds its own with XPath (barely). I only mean Tamarin is open source and may lend inspiration to parsing Scala XML and IDEs will be increasingly capable of parsing it as well.

    Jon Toland Monday, May 24, 2010 at 8:26 am
  2. There are absolutely ways to create tools which gracefully handle composed languages. Stratego/ASF+SDF is specifically designed to do that sort of thing. The problem (or at least, the road block) is that language composition *demands* a generalized parser, something which people tend to shy away from for one reason or another. With a scannerless generalized parser, embedded XML is really quite easy to handle. Without it, you end up with weird editor glitches and arbitrary and sub-optimal resolution of ambiguity.

    Daniel Spiewak Monday, May 24, 2010 at 8:35 am
  3. Wow just got around to finishing reading. I’ve lamented E4X’s weaknesses relative to real XPath but its implementation in Tamarin feels so powerful by comparison :& Thanks for your writeup!

    Jon Toland Monday, May 24, 2010 at 8:28 pm
  4. This is another inconsistency I found with the “auto-magical interpolation”:

    http://scala-programming-language.1934581.n4.nabble.com/scala-Embedded-expression-types-in-XML-literals-td2000039.html#a2000039

    The basis of that problem has caused me all sorts of annoying behaviour that only shows up at runtime.

    The other thing that I find annoying is the need to pass in null to the Elem constructor/apply function to indicate an empty namespace. I still can’t think of any good reason why the type isn’t specified as Option[String].

    Kristian Domagala Tuesday, May 25, 2010 at 12:13 am
  5. What are the chances that the problems will get fixed in 2.8 or later versions?

    Afaiu the problems you mentioned should be fixable without breaking too much code.

    Considering that people will have to look at their code if they want to go from 2.7 to 2.8 (especially considering collections), i think it wouldn’t be too bad if people would have to do the same for XML in 2.x.

    As long as the transition doesn’t get too hard and the improvements are visible, most people won’t have a problem with it.

    One fix: In your first paragraph you mentioned “… unrestricted operator overloading …”. Scala doesn’t have operator overloading, because it has no operators. Although I understand what you want to say I think it creates a false impression, because people will think of C++-style things then.

    Imho the unrestricted method naming is a great thing, even if there will always be people who abuse it. There are also people who invent stupid method names no one does understand … but no one had the idea to restrict the way people name methods because of that.

    PS: Do you have links to the bugs on Trac regarding the XML things?

    steve Tuesday, May 25, 2010 at 4:44 am
  6. @Kristian

    Thanks for the link on the weirdness with XML interpolation. I wasn’t aware of the weirdness surrounding implicit conversions within XML literals, but it certainly isn’t out of character for the library.

    As for the null in the constructor, I couldn’t agree with you more. It’s just yet another API relic from the fact that the XML library (like much of Scala) began its life as a one-off research project.

    @steve

    I think you would be surprised how much code depends on some of the weird and subtle behavior that I outlined. Lift in particular makes heavy use of the XML library. Fixing any of these bugs would have a profound effect.

    With that said, Dave Pollak has stated that he would be all in favor of fixing the XML support, even if it meant he had to work a bit harder to keep Lift in sync. Unfortunately, it’s a little late in the 2.8 release cycle to make those sorts of changes. Paul Phillips has made some improvements, and some improvements were inherited from other areas of the standard library (e.g. the underlying data structure is no longer mutable), but most of the weirdness remains.

    I’ve thought about pushing harder for fixes in this area, but in a sense, it doesn’t seem worthwhile. The most annoying problem (for me) is the inability to pattern match on attributes. However, fixing that problem would mean that all of the XML patterns (which don’t specify attributes) would suddenly no longer match elements with attributes, meaning that a *lot* of code would break.

    On the whole, I think the XML library is what it is, and it’s too late to fix it. It’s a shame, but there’s not much that can be done without breaking truckloads of code.

    Daniel Spiewak Tuesday, May 25, 2010 at 6:48 am
  7. Dan, I’m happy to see heavy use of the Scala XML library *and* your criticism. I’ve been holding off on fixing XML bugs due to the putative freeze in trunk, but I plan to get back into it as soon as trunk thaws.

    I’d appreciate it if you could spend whatever time you feel is merited and file tickets, comment on tickets, and tell me whether you’re on board with the list of what I consider ought to be the axiomatic goals of Scala-XML going forward:

    http://www.scala-lang.org/node/4510

    Thanks!

    Alex Cruise Tuesday, May 25, 2010 at 10:49 am
  8. @Alex

    That looks like a good list to me! I think if usage consistency could be improved, that alone would be a big win. Add to that fixing pattern matching and full XPath support, and I would be a happy coder.

    I’m not sure that tickets are the right answer; this almost calls for a full SIP. Has one already been started for fixing XML?

    Daniel Spiewak Tuesday, May 25, 2010 at 11:05 am
  9. I think a SIP is definitely in order, to the extent that we plan on making any source-incompatible changes.

    Of course, there are different classes of user, and important changes that are source-compatible with end user code but break frameworks are more likely than not, realistically. But I certainly would prefer to minimize damage to end-user code.

    One tactical approach that’s probably worth considering is to just spin up scala.xml2 and let everyone do their own “forklift upgrade” as they see fit, rather than trying to ride two horses.

    If Paul manages to achieve sanity in the pattern matcher, that will help a lot.

    Alex Cruise Tuesday, May 25, 2010 at 11:12 am
  10. So I’m new to Scala which is introducing me to a host of academic concepts I was previously unaware. I have a couple questions which may sound silly but I’d like to see Scala be “the next C++” across platforms and wanted to provide outsider feedback.

    What’s the status of Java 7’s native XML support? If active how would it play with Scala if at all?

    Scala’s XPath syntax looks muddled to me. I’m not extolling ECMA-357 but its practical usage could lend nuances to Scala’s XML. For instance the string literals especially for attributes look uncomfortable by comparison. Several of Dan’s other criticisms also are remedied although I don’t know how portable.

    http://dispatchevent.org/roger/as3-e4x-rundown/
    http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-357.pdf

    Forgive any irrelevance :)

    Jon Toland Tuesday, May 25, 2010 at 12:21 pm
  11. Hello Daniel,

    thank you very much for this post. Really a good starting point!

    with best regards

    Mario Monday, May 31, 2010 at 10:57 am
  12. Thanks for a good article.

    I believe you have an error regarding the XPath syntax:

    ns \ “foo” // => …

    What actually happens is:

    scala> val ns = Text
    ns: scala.xml.Elem = Text

    scala> ns \ “foo”
    res31: scala.xml.NodeSeq = NodeSeq()

    scala> ns \ “bar”
    res32: scala.xml.NodeSeq = NodeSeq(Text)

    scala> ns \\ “foo”
    res33: scala.xml.NodeSeq = NodeSeq(Text)

    I.e., the \ method does not return the target NodeSeq (foo), only its direct descendants, while \\ includes the target as well.

    Knut Arne Vedaa Thursday, July 29, 2010 at 7:06 am
  13. It seems that writing XML in the comment was not the best idea, as it did not turn up as expected. I guess the phase of the moon was wrong. :) Feel free to delete my previous comment, as the Scala code in it does not make much sense as it stands. Sorry for the inconvenience.

    Knut Arne Vedaa Thursday, July 29, 2010 at 7:13 am
  14. Two questions: 1) Suppose I want a case which matches any tags, with the purpose being to pull out the contents of the matched tags. What I’m really trying to do is escape certain character entities in every text node at any depth in the document. I’m assuming the solution will be recursive, but I’m stuck on how to match the fact that a tag is there and grab the contents without caring what the tag itself is. Or is there a better way to walk through all nodes, modifying the text nodes as we go?

    2) I noticed that the following snippet of code fails to produce anything from the if statement.
    . . .
    {
    if (numDots > 0)

    for (p <- extraParams.reverse)
    yield
    }
    <xsl:with-param . . .

    However, it works fine if I put a }{ immediately before the for statement. I'm guessing that each escape into Scala code {} is evaluated as a block returning a single value. From a Scala perspective that makes some sense, but in this case it wasted a couple hours of my time trying to figure out why I wasn't getting a numdots param in my output.

    What do you think: Bug or Feature?

    Thanks for the excellent write-up. It really moved me past some stumbling blocks.

    Daniel Ashton Thursday, March 17, 2011 at 6:31 pm
