Choosing Best File Format (Part 3)

by Jul 5, 2023


PowerShell supports a wide variety of text file formats, so what’s the best way to save and read data?

In the first two parts of this series we outlined a practical guide for choosing the best file format (and the appropriate cmdlets) based on the nature of your data.

If you decided to use XML as your data format, you have seen that the built-in cmdlets Export-CliXml and Import-CliXml are simple ways to save *your own objects* to XML files and read them back. But how can you deal with XML data from sources that you did not create yourself? Let’s take a look at another cmdlet with the noun “Xml”: ConvertTo-Xml. It converts any object(s) into XML format:

 
PS> Get-Process -Id $pid | ConvertTo-Xml

xml                            Objects
---                            -------
version="1.0" encoding="utf-8" Objects  

The result is an XML document object, so it only becomes useful once you store it in a variable. You can then examine the object and, for example, output its XML string representation:

 
PS> $xml = Get-Process -Id $pid | ConvertTo-Xml
PS> $xml.OuterXml
<?xml version="1.0" encoding="utf-8"?><Objects><Object Type="System.Diagnostics.Process"><Property Name="Name"
 Type="System.String">powershell_ise</Property><Property Name="SI" Type="System.Int32">1</Property><Property N
ame="Handles" Type="System.Int32">920</Property><Property Name="VM" Type="System.Int64">5597831168</Property><
Property Name="WS" Type="System.Int64">265707520</Property><Property Name="PM" Type="System.Int64">229797888</
Property><Property Name="NPM" Type="System.Int64">53904</Property><Property Name="Path" Type="System.String">C
:\WINDOWS\system32\WindowsPowerShell\v1.0\PowerShell_ISE.exe</Property><Property Name="Company" Type="System.S
tring">Microsoft Corporation</Property><Property Name="CPU" Type="System.Double">3,984375</Property><Property 
Name="FileVersion" Type="System.String">10.0.19041.1 (WinBuild.160101.0800)</Property><Property Name="Produc... 
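
If all you need is the text, ConvertTo-Xml can also return a string directly through its -As parameter. This is a minimal sketch; the variable name $text is chosen for illustration only:

```powershell
# -As String returns the XML as a single formatted string
# instead of an XML document object
$text = Get-Process -Id $pid | ConvertTo-Xml -As String -Depth 1

# output the XML text
$text
```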

Even though there is no Export-Xml, you can easily create your own Export-Xml that persists objects to a file without using the proprietary “CliXml” structure:

# data to be persisted in XML:
$Data = Get-Process | Select-Object -First 10 # let's take the first 10 processes,
                                              # can be any data
# destination path for XML file:
$Path = "$env:temp\result.xml"

# take original data
$Data | 
    # convert each item into an XML object but limit to 2 levels deep
    ConvertTo-Xml -Depth 2 | 
    # pass the string XML representation which is found in property
    # OuterXml
    Select-Object -ExpandProperty OuterXml |
    # save to plain text file with appropriate encoding
    Set-Content -Path $Path -Encoding UTF8 

notepad $Path

To go the opposite route and turn XML back into objects, there is no ConvertFrom-Xml – because this functionality is already built into the type [Xml].
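
As a quick illustration of the [xml] type, here is a small sketch that casts a string to [xml] and then navigates the result with dot notation. The XML content (Servers, Server, Name, Role) is made up for this example and does not come from a real data source:

```powershell
# hypothetical sample XML, made up for illustration
$doc = [xml]'<Servers><Server Name="web01" Role="Frontend" /><Server Name="db01" Role="Database" /></Servers>'

# elements and attributes surface as regular properties
$doc.Servers.Server | Select-Object -Property Name, Role
```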

To convert the sample file from above back into objects, here is what you could do (provided you created the file result.xml with the sample code above):

# path to XML file:
$Path = "$env:temp\result.xml"

# read file and convert to XML:
[xml]$xml = Get-Content -Path $Path -Raw -Encoding UTF8 

# dive into the XML object model (which in this case happens
# to start with the root properties named "Objects", then
# "Object"):
$xml.Objects.Object |
  # next, examine each object found here:
  ForEach-Object {
    # each object describes all serialized properties
    # in the form of an array of objects with properties
    # "Name" (property name), "Type" (used data type), 
    # and "#text" (property value). 
    # One simple way of finding a specific entry
    # in this array is to use .Where{}:
    $Name = $_.Property.Where{$_.Name -eq 'Name'}.'#text'
    $Id = $_.Property.Where{$_.Name -eq 'Id'}.'#text'
    $_.Property | Out-GridView -Title "Process $Name (ID $Id)"
  }
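
As an alternative to .Where{}, a single property can also be looked up with an XPath expression, because XML nodes provide the .NET method SelectSingleNode(). This is only a sketch: it serializes the current process in memory rather than reading result.xml, and the variable names are chosen for illustration:

```powershell
# serialize a single process in memory (no file needed for this sketch)
$xml = Get-Process -Id $pid | ConvertTo-Xml -Depth 1

# with only one serialized object, this is a single XML element
$node = $xml.Objects.Object

# XPath: find the child element "Property" whose attribute "Name" is "Id"
$idText = $node.SelectSingleNode("Property[@Name='Id']").InnerText
$idText
```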

The first part of this code, reading the file and casting its content to [xml], works for any well-formed XML file, so you can use it as a template to read in and process virtually any XML file.

That said, to make use of the data you need to understand its internal structure. In our example, we “serialized” 10 process objects. As it turns out, ConvertTo-Xml saves these objects by describing all of their properties. The code above illustrates how you first get to the serialized objects (found in .Objects.Object), then how to read the property information (found in .Property as an array of objects, one per property).
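
Because each property is described by its name and its value, the serialized data can be turned back into simple objects. The following is a minimal sketch, not a full deserializer: it serializes three processes in memory (so no file is needed), and the variable names $objects, $node and $props are chosen for illustration only. All values come back as strings, and properties without a simple text value end up empty:

```powershell
# sample data, serialized in memory
$xml = Get-Process | Select-Object -First 3 | ConvertTo-Xml -Depth 1

$objects = foreach ($node in $xml.Objects.Object) {
    # collect name/value pairs in their original order
    $props = [ordered]@{}
    foreach ($p in $node.Property) {
        # "#text" holds the property value as a string
        $props[$p.Name] = $p.'#text'
    }
    # turn the collected pairs into one object
    [pscustomobject]$props
}

$objects | Select-Object -Property Name, Id, Handles
```

Since the result consists of plain objects again, the usual cmdlets such as Where-Object or Sort-Object can process it. Keep in mind that sorting compares text unless you convert the values first.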

 

