Text files can be stored using different encodings, and to read them correctly, you must specify the encoding. That’s why most cmdlets that read text files offer the -Encoding parameter (for example, Get-Content). If you don’t specify the correct encoding, you will likely end up with garbled special characters and umlauts.
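To see the effect, here is a minimal sketch that writes a short word containing an umlaut as UTF-8 and then decodes the same bytes once correctly and once as Latin-1. The temp file name `encoding-demo.txt` and the sample word are assumptions made up for this demo, and the umlaut is built from its code point so the result does not depend on how the script file itself was saved.

```powershell
# Build the sample text from a code point (0xFC is the umlaut in "Gruen")
# so the demo does not depend on the encoding of this script file.
$text = 'Gr' + [char]0xFC + 'n'

# Write the text as UTF-8 without a byte order mark (BOM).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'encoding-demo.txt'
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($path, $text, $utf8NoBom)

# Correct: tell Get-Content which encoding the file uses.
$right = Get-Content -Path $path -Encoding UTF8

# Wrong: decode the very same bytes as Latin-1 (ISO-8859-1).
$latin1 = [System.Text.Encoding]::GetEncoding('iso-8859-1')
$wrong = $latin1.GetString([System.IO.File]::ReadAllBytes($path))

$right -eq $text    # True
$wrong -eq $text    # False
$wrong.Length       # 5: the two UTF-8 bytes of the umlaut became two characters

Remove-Item -Path $path
```

The file content never changes here. Only the encoding used for decoding differs, and that alone is enough to turn one umlaut into two unrelated characters.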
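Yet how do you (automatically) determine the encoding a given text file uses? Here is a handy function that can help. It inspects the byte order mark (BOM) at the start of the file and reports ASCII when it finds none, so a file saved without a BOM (for example, UTF-8 without BOM) is reported as ASCII, too: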
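    function Get-Encoding
    {
      param
      (
        [Parameter(Mandatory,ValueFromPipeline,ValueFromPipelineByPropertyName)]
        [Alias('FullName')]
        [string]
        $Path
      )

      process
      {
        # read the first four bytes; they hold the byte order mark (BOM), if any
        $bom = New-Object -TypeName System.Byte[](4)
        $file = New-Object System.IO.FileStream($Path, 'Open', 'Read')
        $count = $file.Read($bom,0,4)
        $file.Close()
        $file.Dispose()

        # no BOM found: fall back to ASCII
        $enc = [Text.Encoding]::ASCII
        if ($bom[0] -eq 0x2b -and $bom[1] -eq 0x2f -and $bom[2] -eq 0x76)
          { $enc = [Text.Encoding]::UTF7 }
        if ($bom[0] -eq 0xff -and $bom[1] -eq 0xfe)
          { $enc = [Text.Encoding]::Unicode }
        # UTF-32 LE starts like UTF-16 LE, followed by two zero bytes,
        # so this check must come after the Unicode check
        if ($count -ge 4 -and $bom[0] -eq 0xff -and $bom[1] -eq 0xfe -and $bom[2] -eq 0x00 -and $bom[3] -eq 0x00)
          { $enc = [Text.Encoding]::UTF32 }
        if ($bom[0] -eq 0xfe -and $bom[1] -eq 0xff)
          { $enc = [Text.Encoding]::BigEndianUnicode }
        # 00 00 FE FF is the big-endian UTF-32 BOM; [Text.Encoding]::UTF32 is little-endian
        if ($count -ge 4 -and $bom[0] -eq 0x00 -and $bom[1] -eq 0x00 -and $bom[2] -eq 0xfe -and $bom[3] -eq 0xff)
          { $enc = New-Object System.Text.UTF32Encoding($true, $true) }
        if ($bom[0] -eq 0xef -and $bom[1] -eq 0xbb -and $bom[2] -eq 0xbf)
          { $enc = [Text.Encoding]::UTF8 }

        [PSCustomObject]@{
          Encoding = $enc
          Path = $Path
        }
      }
    }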
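The hex values the function compares against are the byte order marks that the respective encodings write at the start of a file. As a small sketch, you can ask .NET for these bytes directly via `GetPreamble()`. The labels in the hash table and the variable names are assumptions chosen for this demo only.

```powershell
# Encodings to inspect; the names on the left are only labels for this demo.
$encodings = [ordered]@{
    'UTF8'             = [System.Text.Encoding]::UTF8
    'Unicode'          = [System.Text.Encoding]::Unicode
    'BigEndianUnicode' = [System.Text.Encoding]::BigEndianUnicode
    'UTF32'            = [System.Text.Encoding]::UTF32
    'ASCII'            = [System.Text.Encoding]::ASCII
}

$table = foreach ($name in $encodings.Keys) {
    # GetPreamble() returns the BOM bytes, or an empty array if there is none.
    $preamble = $encodings[$name].GetPreamble()
    [PSCustomObject]@{
        Encoding = $name
        BOM      = ($preamble | ForEach-Object { $_.ToString('x2') }) -join ' '
    }
}

$table
# UTF8             -> ef bb bf
# Unicode          -> ff fe
# BigEndianUnicode -> fe ff
# UTF32            -> ff fe 00 00
# ASCII            -> (empty, no BOM)
```

Note that the UTF-32 (little-endian) BOM begins with the same two bytes as the UTF-16 (little-endian) BOM, so any BOM check has to let the longer four-byte signature take precedence.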
Here is a test run checking all text files in your user profile:
    PS> dir $home -Filter *.txt -Recurse | Get-Encoding

    Encoding                    Path
    --------                    ----
    System.Text.UnicodeEncoding C:\Users\tobwe\E006_psconfeu2019.txt
    System.Text.UnicodeEncoding C:\Users\tobwe\E009_psconfeu2019.txt
    System.Text.UnicodeEncoding C:\Users\tobwe\E027_psconfeu2019.txt
    System.Text.ASCIIEncoding   C:\Users\tobwe\.nuget\packages\Aspose.Words\18.12.0\...
    System.Text.ASCIIEncoding   C:\Users\tobwe\.vscode\extensions\ms-vscode.powers...
    System.Text.UTF8Encoding    C:\Users\tobwe\.vscode\extensions\ms-vscode.powers...
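If you want to cross-check a result, .NET has its own BOM detection: a `System.IO.StreamReader` created with `detectEncodingFromByteOrderMarks` set to `$true` updates its `CurrentEncoding` property after the first read. The sketch below is a different technique than Get-Encoding, not a replacement for it, and the temp file `bom-demo.txt` is an assumption created just for this demo.

```powershell
# Create a sample file; [Text.Encoding]::Unicode writes the FF FE BOM.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'bom-demo.txt'
[System.IO.File]::WriteAllText($path, 'Hello', [System.Text.Encoding]::Unicode)

# ASCII is only the fallback here; $true turns on BOM detection.
$reader = New-Object System.IO.StreamReader($path, [System.Text.Encoding]::ASCII, $true)
try {
    # CurrentEncoding is only reliable after the first read, so peek once.
    $null = $reader.Peek()
    $detected = $reader.CurrentEncoding
    $content = $reader.ReadToEnd()
}
finally {
    $reader.Dispose()
}

$detected.WebName   # utf-16
$content            # Hello

Remove-Item -Path $path
```

Like the function above, this approach only recognizes files that actually start with a BOM. For a file without one, the reader simply keeps the fallback encoding you passed in.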