Getting PowerShell Team Blog Topic Headers

Oct 7, 2010

In a previous tip, you learned how to use regular expressions (RegEx) to scrape information from Web pages. It comes down to finding the right "anchors": fixed pieces of text that mark the start and the end of what you are seeking. The following code reads all PowerShell team blog headers:

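# Start anchor '<span></span>', end anchor '</a></h4>'; (.*?) lazily captures the header text in between
$regex = [RegEx]'<span></span>(.*?)</a></h4>'

# Download the blog page as a single string
$url = 'http://blogs.msdn.com/b/powershell/'
$wc = New-Object System.Net.WebClient
$content = $wc.DownloadString($url)

# Emit the first capture group (the header text) of every match
$regex.Matches($content) | ForEach-Object { $_.Groups[1].Value }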
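
To see how the anchors work without depending on the live page, here is a minimal sketch that applies the same pattern to a small HTML fragment. The fragment in $sample is hypothetical and made up for illustration; the markup of the real page may differ or may have changed since this tip was written.

```powershell
# Hypothetical HTML fragment (an assumption for illustration, not the real page markup)
$sample = @'
<h4><a href="/post1"><span></span>First Post</a></h4>
<h4><a href="/post2"><span></span>Second Post</a></h4>
'@

# '<span></span>' is the start anchor and '</a></h4>' is the end anchor.
# (.*?) is lazy, so each match stops at the first end anchor it finds.
$regex = [RegEx]'<span></span>(.*?)</a></h4>'

# Collect the text captured between the anchors for every match
$titles = $regex.Matches($sample) | ForEach-Object { $_.Groups[1].Value }
$titles
```

The lazy quantifier matters here: with a greedy (.*), two headers on the same line would be swallowed into a single match that runs up to the last end anchor.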
