Use Select-String for Fast Text File Parsing

by Jun 25, 2013

Select-String is an extremely useful cmdlet for parsing log files. You can use it to dump all lines in a text file that contain a certain keyword or phrase. This dumps all lines from WindowsUpdate.log that contain the phrase “successfully installed”:

Select-String -Path C:\Windows\WindowsUpdate.log -Pattern 'successfully installed'

You can also ask Select-String to return the lines above or below each match. The next example uses output generated by ipconfig.exe and lists all lines with “IPv4” plus the line that follows each of them:
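A minimal sketch of such a call, using Select-String's -Context parameter, where the first number is the count of lines to include before each match and the second the count after it. The sample lines and addresses below are made up and stand in for real ipconfig.exe output so the snippet runs on any system:

```powershell
# sample text standing in for ipconfig.exe output (hypothetical addresses)
$sample = @(
  '   IPv4 Address. . . . . . . . . . . : 192.168.1.10'
  '   Subnet Mask . . . . . . . . . . . : 255.255.255.0'
  '   Default Gateway . . . . . . . . . : 192.168.1.1'
)

# -Context 0,1 asks for 0 lines before and 1 line after each match
$hits = $sample | Select-String -Pattern 'IPv4' -Context 0,1
$hits

# on a Windows machine, the live equivalent would be:
# ipconfig.exe | Select-String -Pattern 'IPv4' -Context 0,1
```

The extra lines are also available as properties: each result carries them in Context.PreContext and Context.PostContext.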

And you can use the results in an object-oriented way. This will find all updates recently installed (if you don’t receive anything, then your computer may not have installed any updates lately):

# find all lines with "successfully installed":
Select-String -Path $env:windir\WindowsUpdate.log -Pattern 'successfully installed' |
  ForEach-Object {
    $information = $_ | Select-Object -Property Date, LineNumber, Product
    
    # split the line at tab characters
    $parts = $_.Line -split '\t'

    # create Date and Time
    # first tab-separated part contains date
    # second tab-separated part contains time, 
    # take only first 8 characters and omit milliseconds
    [DateTime]$information.Date = $parts[0] + ' ' + $parts[1].SubString(0,8)

    # extract product name which always follows after 'following update: ':
    $information.Product = ($_.Line -split 'following update: ')[-1]
    
    # return custom object
    $information
  } | Out-GridView

The previous approach with Select-String is almost 10x faster than reading the file line by line and doing the filtering manually with Where-Object:

# slower alternative: read every line, then filter manually with Where-Object
Get-Content -Path $env:windir\WindowsUpdate.log |
  Where-Object { $_ -like '*successfully installed*' } |
  ForEach-Object {
    # plain strings have none of these properties, so all three start out empty
    $information = $_ | Select-Object -Property Date, LineNumber, Product

    # Get-Content attaches the line number to each line as ReadCount
    $information.LineNumber = $_.ReadCount

    # split the line at tab characters
    $parts = $_ -split '\t'

    # create Date and Time
    # first tab-separated part contains date
    # second tab-separated part contains time,
    # take only first 8 characters and omit milliseconds
    [DateTime]$information.Date = $parts[0] + ' ' + $parts[1].Substring(0,8)

    # extract product name which always follows after 'following update: ':
    $information.Product = ($_ -split 'following update: ')[-1]

    # return custom object
    $information
  } | Out-GridView
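To check the speed difference on your own machine, here is a rough sketch that times both approaches with Measure-Command. The temporary file name and its generated content are hypothetical test data, not a real WindowsUpdate.log, and the ratio you measure will depend on file size and PowerShell version:

```powershell
# build a temporary log file (made-up content) so the sketch runs anywhere
$path = Join-Path ([IO.Path]::GetTempPath()) 'select-string-demo.log'
$lines = foreach ($i in 1..20000) {
  if ($i % 1000 -eq 0) { "line $i`tsuccessfully installed the following update: Demo $i" }
  else { "line $i`tnothing to report" }
}
Set-Content -Path $path -Value $lines

# time Select-String, which searches the file directly
$fast = Measure-Command {
  $viaSelectString = @(Select-String -Path $path -Pattern 'successfully installed')
}

# time reading line by line and filtering manually with Where-Object
$slow = Measure-Command {
  $viaWhereObject = @(Get-Content -Path $path |
    Where-Object { $_ -like '*successfully installed*' })
}

'Select-String: {0:n0} ms' -f $fast.TotalMilliseconds
'Where-Object:  {0:n0} ms' -f $slow.TotalMilliseconds

# clean up the temporary file
Remove-Item -Path $path
```

Both approaches should find the same lines; only the elapsed time differs.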
