Deserialization of nested XML in .NET

I have used .NET XML serialization a lot in a few recent projects and struggled to mark nested classes with the right attributes so that they serialize correctly. How the serializer behaves seems tricky at first glance. I can get it right by trial and error, but that’s not optimal, so it’s time to summarize the tips and tricks I have learned along the way.

Let’s jump right into it. I have this class that I want to serialize. It contains a collection of Project objects.


using System.Collections.Generic;

public class MyXML
{
    public List<Project> projects { get; set; }

    public class Project
    {
        public int Id { get; set; }
    }
}

I instantiated the class and added two projects to it. It serializes to the XML below. Note that .NET uses the property name projects as the array element name and the type name Project as the name of each array item.

<MyXML>
 <projects>
   <Project>
    <Id>1</Id>
   </Project>
   <Project>
    <Id>2</Id>
   </Project>
 </projects>
</MyXML>
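For reference, here is a minimal sketch of how the instantiation and serialization step could look. The Demo class and the use of a StringWriter are my own choices for illustration, not something the projects above prescribe. The real output also carries an XML declaration and the xmlns:xsi / xmlns:xsd namespace declarations on the root element, which the listing above leaves out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class MyXML
{
    public List<Project> projects { get; set; }

    public class Project
    {
        public int Id { get; set; }
    }
}

// Hypothetical driver class, used only for this illustration.
public static class Demo
{
    public static void Main()
    {
        // Build the object graph: one MyXML holding two projects.
        var doc = new MyXML
        {
            projects = new List<MyXML.Project>
            {
                new MyXML.Project { Id = 1 },
                new MyXML.Project { Id = 2 }
            }
        };

        // Serialize to a string so the result can be inspected.
        var serializer = new XmlSerializer(typeof(MyXML));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, doc);
            Console.WriteLine(writer.ToString());
        }
    }
}
```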

To change the array name to 'Ps' and the array item name to 'P', mark the property like this:


using System.Collections.Generic;
using System.Xml.Serialization;

public class MyXML
{
    // Without this attribute, or with a blank array name,
    // the property name "projects" is used.
    [XmlArray("Ps")]
    // Without this attribute, or with a blank array item name,
    // the class name "Project" is used.
    [XmlArrayItem("P")]
    public List<Project> projects { get; set; }

    public class Project
    {
        public int Id { get; set; }
    }
}

This generates XML like the following.

<MyXML>
 <Ps>
   <P>
    <Id>1</Id>
   </P>
   <P>
    <Id>2</Id>
   </P>
 </Ps>
</MyXML>
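The same attributes drive deserialization as well. As a minimal sketch, assuming the XML above arrives as a string (the Demo class and the inline XML literal are only for illustration), reading it back into the attributed class could look like this:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class MyXML
{
    [XmlArray("Ps")]
    [XmlArrayItem("P")]
    public List<Project> projects { get; set; }

    public class Project
    {
        public int Id { get; set; }
    }
}

// Hypothetical driver class, used only for this illustration.
public static class Demo
{
    public static void Main()
    {
        // The same document as above, written on one line.
        const string xml =
            "<MyXML><Ps><P><Id>1</Id></P><P><Id>2</Id></P></Ps></MyXML>";

        var serializer = new XmlSerializer(typeof(MyXML));
        using (var reader = new StringReader(xml))
        {
            // Deserialize returns object, so a cast is needed.
            var doc = (MyXML)serializer.Deserialize(reader);

            // Print the Id of every project that was read back.
            foreach (var project in doc.projects)
            {
                Console.WriteLine(project.Id);
            }
        }
    }
}
```

If the element names in the attributes do not match the XML, the serializer typically skips the unknown elements rather than throwing, so an unexpectedly empty collection is a good hint that a name is off.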

This poses a problem. Because the array item name is specified on the collection, what if we want the item name to depend on the member’s type? This is a common requirement when serializing a generic class whose collection can hold any type of member. We need a way to specify the array item name on the member type instead of on the collection itself.

So let’s remove [XmlArrayItem("P")] first. Now how do we specify the name on the Project class? One might think that marking the Project class with [XmlRoot("P")] would achieve this, but XmlRoot only takes effect when the object is serialized as the root element. The correct way is to use [XmlType("P")] instead. The following class definition serializes to the same XML as above, but gives us the flexibility of specifying the array item name on each member type.


public class MyXML
{
    [XmlArray("Ps")]
    // The item name is now specified on the definition of the member type below.
    // [XmlArrayItem("P")]
    public List<Project> projects { get; set; }

    // This is NOT going to work:
    // [XmlRoot("P")]
    // Use XmlType to specify the array item's name.
    [XmlType("P")]
    public class Project
    {
        public int Id { get; set; }
    }
}
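To make the generic case concrete, here is a rough sketch under a few assumptions of mine: Container<TItem>, Milestone and the ToXml helper are hypothetical names that do not appear in my projects, and the [XmlRoot("MyXML")] on the container is only there to pin the root element name. Each member type carries its own XmlType, and the collection carries no XmlArrayItem at all.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlType("P")]
public class Project
{
    public int Id { get; set; }
}

// Hypothetical second member type, added only for this illustration.
[XmlType("M")]
public class Milestone
{
    public string Title { get; set; }
}

// Hypothetical generic container; XmlRoot pins the root element name.
[XmlRoot("MyXML")]
public class Container<TItem>
{
    [XmlArray("Items")]
    public List<TItem> Items { get; set; }
}

public static class Demo
{
    // Hypothetical helper: serializes any value to an XML string.
    private static string ToXml<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var projects = new Container<Project>
        {
            Items = new List<Project> { new Project { Id = 1 } }
        };
        var milestones = new Container<Milestone>
        {
            Items = new List<Milestone> { new Milestone { Title = "Beta" } }
        };

        // The array items should be named after Project's XmlType ("P") here...
        Console.WriteLine(ToXml(projects));
        // ...and after Milestone's XmlType ("M") here.
        Console.WriteLine(ToXml(milestones));
    }
}
```

One thing to keep in mind with this design: XmlType renames the type everywhere it is used as an XML type, not only inside this one collection, so it is a per-type decision rather than a per-property one.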

Up to this point, we have serialized the projects property as a collection, with the array name at the first level and the array item names at the second level. What if we want to place the array items directly under the root element? Can we just “ignore” the projects property and serialize the collection members directly? In other words, how do we serialize to the XML below?

<MyXML>
 <P>
   <Id>1</Id>
 </P>
 <P>
   <Id>2</Id>
 </P>
</MyXML>

Marking the property with [XmlIgnore] seems like an intuitive way to do it, but it results in the projects collection not being serialized at all. The correct way is to mark the property with [XmlElement]. The following class definition generates the XML we want.

public class MyXML
{
    // This is NOT going to work:
    // [XmlIgnore]
    // This makes sure that the collection's members are serialized directly,
    // without being wrapped in an array name element.
    [XmlElement("P")]
    public List<Project> projects { get; set; }

    public class Project
    {
        public int Id { get; set; }
    }
}

Now we face a similar challenge: how do we get a member-specific name when serializing a generic object? The element name is currently specified on the projects collection, not on the Project class, which prevents us from having a different element name for each type of collection member.

In my research I didn’t find a different place to put the attribute, or a different attribute, that achieves this. However, it can be achieved with XmlAttributeOverrides, which overrides attributes at run time!

So what we want here is to override the attributes of the projects property with [XmlElement("A_Dynamic_Name")]. The following code snippet demonstrates how to use XmlAttributeOverrides to do this.

XmlAttributeOverrides overrides = new XmlAttributeOverrides();
XmlAttributes attrs = new XmlAttributes();

// This is equivalent to [XmlElement("Anything")].
attrs.XmlElements.Add(new XmlElementAttribute("Anything"));

// This tells the serializer to find the property 'projects' on the MyXML class
// and replace its attributes with the ones defined above.
overrides.Add(typeof(MyXML), "projects", attrs);

// Pass the overrides to the serializer.
XmlSerializer xS = new XmlSerializer(typeof(MyXML), overrides);

With these overrides in place, the serializer treats MyXML as if it had been declared as follows. Because the element name is supplied at run time, we can, for example, read the type name of the collection member and use it as the element name.

public class MyXML
{
    // Effectively applied at run time by the overrides:
    // [XmlElement("Anything")]
    public List<Project> projects { get; set; }

    public class Project
    {
        public int Id { get; set; }
    }
}

It serializes to this XML:

<MyXML>
 <Anything>
  <Id>1</Id>
 </Anything>
 <Anything>
  <Id>2</Id>
 </Anything>
</MyXML>
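To show the "dynamic name" idea end to end, here is a minimal sketch that takes the element name from the member type at run time. Container<TItem> and CreateSerializer are hypothetical names of my own, and using typeof(TItem).Name as the element name is just one possible convention.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class Project
{
    public int Id { get; set; }
}

// Hypothetical generic container standing in for MyXML.
[XmlRoot("MyXML")]
public class Container<TItem>
{
    public List<TItem> Items { get; set; }
}

public static class Demo
{
    // Hypothetical helper: builds a serializer whose item element name
    // is the run-time type name of TItem (for example "Project").
    public static XmlSerializer CreateSerializer<TItem>()
    {
        // Equivalent to putting [XmlElement("<type name>")] on Items.
        var attrs = new XmlAttributes();
        attrs.XmlElements.Add(new XmlElementAttribute(typeof(TItem).Name));

        // Attach the attribute to the Items property of this closed generic type.
        var overrides = new XmlAttributeOverrides();
        overrides.Add(typeof(Container<TItem>), "Items", attrs);

        return new XmlSerializer(typeof(Container<TItem>), overrides);
    }

    public static void Main()
    {
        var doc = new Container<Project>
        {
            Items = new List<Project>
            {
                new Project { Id = 1 },
                new Project { Id = 2 }
            }
        };

        var serializer = CreateSerializer<Project>();
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, doc);
            Console.WriteLine(writer.ToString());
        }
    }
}
```

A practical note: serializers constructed with an XmlAttributeOverrides argument are not cached by the framework the way new XmlSerializer(typeof(T)) is, so it is worth creating each one once and reusing it instead of building a new one per call.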

As a side note, the same technique can also override the root element name with a small modification.

// Add this to the overrides before constructing the XmlSerializer.
XmlAttributes attrs1 = new XmlAttributes();
attrs1.XmlRoot = new XmlRootAttribute("AnyRootName");
overrides.Add(typeof(MyXML), attrs1);
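Putting both overrides together, a complete sketch could look like the following. The Demo class is hypothetical; the point is only that both the member override and the root override go into the same XmlAttributeOverrides instance before the serializer is constructed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class MyXML
{
    public List<Project> projects { get; set; }

    public class Project
    {
        public int Id { get; set; }
    }
}

// Hypothetical driver class, used only for this illustration.
public static class Demo
{
    public static void Main()
    {
        var overrides = new XmlAttributeOverrides();

        // Member-level override: equivalent to [XmlElement("Anything")] on projects.
        var memberAttrs = new XmlAttributes();
        memberAttrs.XmlElements.Add(new XmlElementAttribute("Anything"));
        overrides.Add(typeof(MyXML), "projects", memberAttrs);

        // Type-level override: equivalent to [XmlRoot("AnyRootName")] on MyXML.
        var rootAttrs = new XmlAttributes();
        rootAttrs.XmlRoot = new XmlRootAttribute("AnyRootName");
        overrides.Add(typeof(MyXML), rootAttrs);

        // The overrides are read when the serializer is constructed.
        var serializer = new XmlSerializer(typeof(MyXML), overrides);

        var doc = new MyXML
        {
            projects = new List<MyXML.Project>
            {
                new MyXML.Project { Id = 1 },
                new MyXML.Project { Id = 2 }
            }
        };

        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, doc);
            Console.WriteLine(writer.ToString());
        }
    }
}
```

If only the root name needs to change, the XmlSerializer(Type, XmlRootAttribute) constructor is a shorter route to the same result.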