Friday, June 7, 2013

org.xml.sax.SAXParseException; ... The processing instruction target matching "[xX][mM][lL]" is not allowed.

To test the XML output of a Grails service, I created the control string for my XMLUnit test as follows:
def testXML = '''
<?xml version='1.0' standalone='no'?>
  <!DOCTYPE labels SYSTEM "label.dtd">
  <labels _FORMAT='labelFormat' >
    <label>
      <variable name='study'>my_study</variable>
      <variable name='visit'>my_visit</variable>
    </label>
  </labels>
'''
This generated the following exception:
org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 9; The processing instruction target matching "[xX][mM][lL]" is not allowed.
  at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251)
    com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300)
    org.custommonkey.xmlunit.XMLUnit.buildDocument(XMLUnit.java:383)
    org.custommonkey.xmlunit.XMLUnit.buildDocument(XMLUnit.java:370)
    org.custommonkey.xmlunit.Diff.<init>(Diff.java:101)
    org.custommonkey.xmlunit.Diff.<init>(Diff.java:93)
After some searching I found a Stack Overflow question that led me to the problem. If your XML begins with an XML declaration (<?xml ... ?>) and you build it in a Groovy multi-line string, the XML content must start on the first line of the string, because the declaration is only allowed at the very beginning of the document. The newline right after the opening ''' pushes the declaration to line 2 (note lineNumber: 2 in the exception), so the parser treats it as an ordinary processing instruction with the reserved target "xml" and rejects it. So the XML string at the beginning should be:
def testXML = '''<?xml version='1.0' standalone='no'?>
   <!DOCTYPE labels SYSTEM "label.dtd">
   <labels _FORMAT='labelFormat' >
    <label>
     <variable name='study'>my_study</variable>
     <variable name='visit'>my_visit</variable>
    </label>
   </labels>
  '''
or simply add a backslash after the opening quotes ('''\), which escapes the first newline so that it does not become part of the string, like
def testXML = '''\
<?xml version='1.0' standalone='no'?>
   <!DOCTYPE labels SYSTEM "label.dtd">
   <labels _FORMAT='labelFormat' >
    <label>
     <variable name='study'>my_study</variable>
     <variable name='visit'>my_visit</variable>
    </label>
   </labels>
  '''
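
To convince myself of the cause, here is a minimal sketch that feeds three variants of a small document to the JDK's built-in DOM parser. The helper name parses and the shortened XML (no DOCTYPE, so the parser does not go looking for label.dtd) are my own assumptions for illustration, not part of the original test.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource
import org.xml.sax.SAXParseException

// Hypothetical helper: true if the JDK DOM parser accepts the text as XML.
boolean parses(String xml) {
    def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    try {
        builder.parse(new InputSource(new StringReader(xml)))
        return true
    } catch (SAXParseException ignored) {
        return false
    }
}

// Declaration ends up on line 2 of the string: rejected.
def leadingNewline = '''
<?xml version='1.0'?>
<labels/>
'''

// Declaration starts right after the opening quotes: accepted.
def firstLine = '''<?xml version='1.0'?>
<labels/>
'''

// Backslash escapes the first newline, so the string also starts with <?xml.
def escapedNewline = '''\
<?xml version='1.0'?>
<labels/>
'''

assert !parses(leadingNewline)
assert parses(firstLine)
assert parses(escapedNewline)
assert escapedNewline.startsWith('<?xml')
```

The parser may still print a "[Fatal Error]" line to stderr for the first variant; that is its default error handler reporting the same exception the helper catches.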
