C# Parsing XML with namespace using LINQ

Overview

Background

In one of my previous posts, I wrote about deserializing XML with namespaces using XmlSerializer, which requires creating custom model classes in order to perform the deserialization. Today, I am going to cover another powerful method for parsing: LINQ to XML.

My Stack

  • Visual Studio 2019 Community.
  • .NET Core 3.1 / C#
  • Windows 10 Pro 64-bit (10.0, Build 19041)

Implementation

Simple LINQ-XML reading

Consider the following XML that contains no namespaces.

<?xml version="1.0" encoding="utf-8" ?>
<Root>
  <Items>
    <Books>
      <Book>
        <ISBN>978-1788478120</ISBN>
        <Name>C# 8.0 and .NET Core 3.0 – Modern Cross-Platform Development: Build applications with C#, .NET Core, Entity Framework Core, ASP.NET Core, and ML.NET using Visual Studio Code, 4th Edition</Name>
        <Price>35.99</Price>
      </Book>
      <Book>
        <ISBN>978-1789133646</ISBN>
        <Name>Hands-On Design Patterns with C# and .NET Core: Write clean and maintainable code by using reusable solutions to common software design problems</Name>
        <Price>34.99</Price>
      </Book>
    </Books>
  </Items>
</Root>

In order to read the Book elements, we could use the following sample code:

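// Requires: using System; using System.Xml.Linq;
// Load the file; root is the <Root> element.
var root = XElement.Load("SimpleBooks.xml");

// Walk down to every <Book> under <Items>/<Books>.
var books = root.Descendants("Items").Descendants("Books").Descendants("Book");

foreach (var book in books)
  Console.WriteLine($"Book name: {book.Element("Name").Value}");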

Another approach is to take advantage of anonymous types in .NET. The sample code below reads all the books into an anonymous type that exposes the three child elements from the XML as read-only properties.

// Requires: using System.Linq;
// Project each Book element into an anonymous type.
var books = from book in root.Descendants("Items").Descendants("Books").Descendants("Book")
            select new
            {
              Name = book.Element("Name").Value,
              ISBN = book.Element("ISBN").Value,
              Price = book.Element("Price").Value // still a string at this point
            };

foreach (var book in books)
  Console.WriteLine($"Book name: {book.Name}");
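
The anonymous type above keeps every value as a string. As a minimal sketch of one way to get typed values, the example below uses the explicit conversion operators that XElement defines (for example to string and decimal). The class name TypedBooks, the helper Read, and the shortened inline XML are assumptions made for illustration; they are not part of the original sample project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TypedBooks
{
    // Hypothetical helper: projects each Book element into a typed tuple.
    public static List<(string Name, string Isbn, decimal Price)> Read(XElement root)
    {
        return root.Descendants("Book")
            .Select(book => (
                Name: (string)book.Element("Name"),      // null if <Name> is missing
                Isbn: (string)book.Element("ISBN"),      // null if <ISBN> is missing
                Price: (decimal)book.Element("Price")))  // throws if <Price> is missing
            .ToList();
    }

    public static void Main()
    {
        // Inline stand-in for SimpleBooks.xml with shortened book names.
        var root = XElement.Parse(@"<Root><Items><Books>
  <Book><ISBN>978-1788478120</ISBN><Name>Book A</Name><Price>35.99</Price></Book>
  <Book><ISBN>978-1789133646</ISBN><Name>Book B</Name><Price>34.99</Price></Book>
</Books></Items></Root>");

        var books = Read(root);

        foreach (var book in books)
            Console.WriteLine($"{book.Name} ({book.Isbn}): {book.Price}");

        // Price is a decimal now, so arithmetic works without manual parsing.
        Console.WriteLine($"Total: {books.Sum(b => b.Price)}");
    }
}
```

Casting to string instead of reading .Value avoids a NullReferenceException when an optional element is absent; for optional numeric elements, a cast to decimal? returns null instead of throwing.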

Simple LINQ-XML reading with namespaces

A quick reminder from the previous article: why do we need namespaces in our XML files?

The short answer is to prevent element naming conflicts within the same file. Remember, XML files can be very long and complex, and may be written by different people, so naming conflicts can be very common. A comparable example is the names of classes in C# code: once a class is inside a namespace, the chance of a conflict is very low. To guarantee uniqueness, we usually use URIs that we own, but the namespace name can actually be any string. There are more details in this question regarding URIs and namespaces.

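For the sample, I am going to add a namespace to the Items element of the XML. Because it is a default namespace, all of its descendant elements (Books, Book, and so on) inherit it.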

xmlns="https://gotask.net"

So our XML looks like:

<?xml version="1.0" encoding="utf-8" ?>
<Root>
  <Items xmlns="https://gotask.net">
    <Books>
      <Book>
        <ISBN>978-1788478120</ISBN>
        <Name>C# 8.0 and .NET Core 3.0 – Modern Cross-Platform Development: Build applications with C#, .NET Core, Entity Framework Core, ASP.NET Core, and ML.NET using Visual Studio Code, 4th Edition</Name>
        <Price>35.99</Price>
      </Book>
      <Book>
        <ISBN>978-1789133646</ISBN>
        <Name>Hands-On Design Patterns with C# and .NET Core: Write clean and maintainable code by using reusable solutions to common software design problems</Name>
        <Price>34.99</Price>
      </Book>
    </Books>
  </Items>
</Root>

Running the previous code on this XML will produce no results. The reason is that, once we have a namespace, each element has its own fully qualified name: the element Books is actually {https://gotask.net}Books, while our code is searching for Descendants("Books"), which matches only elements that are in no namespace.

In order to correctly parse the file above, we need to specify the namespace using the XNamespace class in every call to Descendants and Element.
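
To make the qualified-name point concrete, here is a small sketch that prints the expanded name LINQ to XML builds from a namespace and a local name, and counts the matches with and without the namespace. The class name ExpandedNames, the helper Sample, and the trimmed-down XML are assumptions for illustration only.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ExpandedNames
{
    // Trimmed-down stand-in for the article's XML (book data shortened).
    public static XElement Sample() => XElement.Parse(@"<Root>
  <Items xmlns=""https://gotask.net"">
    <Books><Book><Name>Book A</Name></Book></Books>
  </Items>
</Root>");

    public static void Main()
    {
        var root = Sample();
        XNamespace ns = "https://gotask.net";
        XName qualified = ns + "Books";

        Console.WriteLine(qualified);            // {https://gotask.net}Books
        Console.WriteLine(qualified.LocalName);  // Books

        // The bare string "Books" means "Books in no namespace", so nothing matches.
        Console.WriteLine(root.Descendants("Books").Count());    // 0
        Console.WriteLine(root.Descendants(qualified).Count());  // 1
    }
}
```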

// The namespace declared on the Items element.
XNamespace x = "https://gotask.net";

// x + "Name" builds the qualified name {https://gotask.net}Name.
var books = root.Descendants(x + "Items").Descendants(x + "Books").Descendants(x + "Book");

foreach (var book in books)
  Console.WriteLine($"Book name: {book.Element(x + "Name").Value}");

Nested namespaces

Consider the following XML, where the Items element is in one namespace, but the Books child element is in another:

<?xml version="1.0" encoding="utf-8" ?>
<Root>
  <Items xmlns="https://gotask.net">
    <Books xmlns="https://books.net">
      <Book>
        <ISBN>978-1788478120</ISBN>
        <Name>C# 8.0 and .NET Core 3.0 – Modern Cross-Platform Development: Build applications with C#, .NET Core, Entity Framework Core, ASP.NET Core, and ML.NET using Visual Studio Code, 4th Edition</Name>
        <Price>35.99</Price>
      </Book>
      <Book>
        <ISBN>978-1789133646</ISBN>
        <Name>Hands-On Design Patterns with C# and .NET Core: Write clean and maintainable code by using reusable solutions to common software design problems</Name>
        <Price>34.99</Price>
      </Book>
    </Books>
  </Items>
</Root>

In the sample code below, we need to specify both namespaces.

// Namespace of the Items element.
XNamespace x = "https://gotask.net";

// Namespace of Books and everything below it.
XNamespace y = "https://books.net";

var books = root.Descendants(x + "Items").Descendants(y + "Books").Descendants(y + "Book");

foreach (var book in books)
  Console.WriteLine($"Book is {book.Element(y + "Name").Value}");

Multiple namespaces with prefix

The XML standard allows us to declare multiple namespaces on the same element. Once we define xmlns="https://somename.net", we are actually defining a default namespace without a prefix. In order to define another namespace, we need to specify a prefix: xmlns:bk="https://books.net".

In order to create child elements that belong to the https://books.net namespace, we need to declare them with the prefix: <bk:Book></bk:Book>. Elements without the prefix will belong to the default namespace.

So let's consider our new XML. We have two namespaces defined: https://gotask.net is the default one and https://books.net has the bk prefix.

We have one Book element in the bk namespace and the other one in the default.

<?xml version="1.0" encoding="utf-8" ?>
<Root>
  <Items xmlns="https://gotask.net" xmlns:bk="https://books.net">
    <Books>
      <bk:Book>
        <bk:ISBN>978-1788478120</bk:ISBN>
        <bk:Name>C# 8.0 and .NET Core 3.0 – Modern Cross-Platform Development: Build applications with C#, .NET Core, Entity Framework Core, ASP.NET Core, and ML.NET using Visual Studio Code, 4th Edition</bk:Name>
        <bk:Price>35.99</bk:Price>
      </bk:Book>
      <Book>
        <ISBN>978-1789133646</ISBN>
        <Name>Hands-On Design Patterns with C# and .NET Core: Write clean and maintainable code by using reusable solutions to common software design problems</Name>
        <Price>34.99</Price>
      </Book>
    </Books>
  </Items>
</Root>

The code below reads only the Book elements belonging to the https://books.net namespace (the bk prefix).

// Default namespace: Items and Books belong to it.
XNamespace x = "https://gotask.net";

// Namespace bound to the bk prefix; the code uses the URI, not the prefix.
XNamespace b = "https://books.net";

var books = root.Descendants(x + "Items").Descendants(x + "Books").Descendants(b + "Book");

foreach (var book in books)
  Console.WriteLine($"Book name: {book.Element(b + "Name").Value}");
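
Queries always match on the namespace URI, never on the prefix, which is only a document-local alias. If the code only knows the prefix, one possible approach is to resolve it at run time with XElement.GetNamespaceOfPrefix, as sketched below. The class name PrefixLookup, the helpers Sample and NamesForPrefix, and the shortened XML are assumptions for illustration, not part of the original project.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class PrefixLookup
{
    // Trimmed-down stand-in for the article's XML (book data shortened).
    public static XElement Sample() => XElement.Parse(@"<Root>
  <Items xmlns=""https://gotask.net"" xmlns:bk=""https://books.net"">
    <Books>
      <bk:Book><bk:Name>Book A</bk:Name></bk:Book>
      <Book><Name>Book B</Name></Book>
    </Books>
  </Items>
</Root>");

    // Hypothetical helper: names of the Book elements in the namespace bound to a prefix.
    public static string[] NamesForPrefix(XElement root, string prefix)
    {
        // Assumption: the declarations live on the first child of Root (Items).
        XElement items = root.Elements().First();

        XNamespace ns = items.GetNamespaceOfPrefix(prefix);
        if (ns == null)
            return new string[0]; // the prefix is not declared in scope

        return items.Descendants(ns + "Book")
            .Select(book => (string)book.Element(ns + "Name"))
            .ToArray();
    }

    public static void Main()
    {
        var root = Sample();

        // The default (prefix-less) namespace in scope on Items.
        Console.WriteLine(root.Elements().First().GetDefaultNamespace()); // https://gotask.net

        foreach (var name in NamesForPrefix(root, "bk"))
            Console.WriteLine(name); // Book A
    }
}
```

Hard-coding the URI, as the article's samples do, is usually the more robust choice, because another author is free to bind the same namespace to a different prefix.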

The code below reads only the Book elements belonging to the default namespace.

XNamespace x = "https://gotask.net";

var books = root.Descendants(x + "Items").Descendants(x + "Books").Descendants(x + "Book");

foreach (var book in books)
  Console.WriteLine($"Book name: {book.Element(x + "Name").Value}");
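
The two samples above each read one of the two books. If we want both, regardless of namespace, one option is to match on the local name only. The sketch below assumes that each Name child is in the same namespace as its parent Book, which holds for this XML but is not guaranteed in general. The class name AllBooks, the helpers Sample and ReadNames, and the shortened XML are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AllBooks
{
    // Trimmed-down stand-in for the article's XML (book data shortened).
    public static XElement Sample() => XElement.Parse(@"<Root>
  <Items xmlns=""https://gotask.net"" xmlns:bk=""https://books.net"">
    <Books>
      <bk:Book><bk:Name>Book A</bk:Name></bk:Book>
      <Book><Name>Book B</Name></Book>
    </Books>
  </Items>
</Root>");

    // Hypothetical helper: reads the name of every Book element, whatever its namespace.
    public static List<string> ReadNames(XElement root)
    {
        return root.Descendants()
            // Compare only the local part of the name and ignore the namespace.
            .Where(e => e.Name.LocalName == "Book")
            // Assumption: the Name child shares the namespace of its parent Book.
            .Select(book => (string)book.Element(book.Name.Namespace + "Name"))
            .ToList();
    }

    public static void Main()
    {
        foreach (var name in ReadNames(Sample()))
            Console.WriteLine($"Book name: {name}");
        // Book name: Book A
        // Book name: Book B
    }
}
```

Ignoring namespaces gives up the conflict protection they exist to provide, so this is best kept for cases where the two namespaces really do describe the same kind of element.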

  • The full source code is available on GitHub.