Identifying Multi-Language Online Documents (Part 2)

Nov 29, 2022

How can you automatically check the supported languages for an online document?

Provided the URL uses a language ID, it’s easy to create a list of URLs with all available language IDs. This is what we did in part 1:

$list = RL -f
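As a rough reminder of that approach, the sketch below builds such a list with the -f format operator. The URL template and the three language IDs are placeholders chosen purely for illustration; they are not the values used in part 1.

```powershell
# Hypothetical URL template; {0} marks the position of the language ID
$URL = 'https://example.com/{0}/docs/page'

# Hypothetical sample of language IDs (part 1 works with the full set)
$languages = 'en-us', 'de-de', 'fr-fr'

# Insert each language ID into the template to get one URL per language
$list = $languages | ForEach-Object { $URL -f $_ }

$list
```

The result is a plain string array, so it can be piped into any later check.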

In this second part, we identify the URLs in the list that actually exist. Just trying to contact the URL via Invoke-WebRequest isn’t sufficient, though:

$list = RL -f

As it turns out, all URLs are happily served from the Microsoft webserver and return a status “OK” (including the ones that do not really exist):

 
PS> New-SCode  

That’s because the Microsoft webserver (like many others) accepts all URLs at first. Then, internally, the webserver figures out what to do next and returns a new URL to the browser. This can be the original URL (if such a resource was found by the webserver), or it can be a completely new URL, such as a universal search site or a customized “not found” notification. The status “OK” therefore says nothing about the validity of a URL.

You can actually see the inner workings by prohibiting automatic redirections. Add the parameters “-MaximumRedirection 0 -ErrorAction Ignore” to Invoke-WebRequest:

$list = RL -f

Now you see how the webserver told the browser that the URL moved to some other place, effectively redirecting the browser to a new URL.
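To make the redirect visible for a single URL, a small helper along the following lines may be useful. It is only a sketch: the function name Get-RedirectInfo is made up for this example, and the behavior described assumes Windows PowerShell 5.1. On PowerShell 7, redirect responses may surface as errors instead, so you may need to add -SkipHttpErrorCheck.

```powershell
# Hypothetical helper (not part of the original tip):
# requests a URL without following redirects and reports what came back
function Get-RedirectInfo {
  param(
    [Parameter(Mandatory)]
    [string]$Url
  )

  # Forbid automatic redirects and suppress the resulting error
  $response = Invoke-WebRequest -Uri $Url -UseBasicParsing -MaximumRedirection 0 -ErrorAction Ignore

  # A 3xx status code plus a Location header indicates a redirect
  [pscustomobject]@{
    Url        = $Url
    StatusCode = $response.StatusCode
    Location   = $response.Headers['Location']
  }
}

# Usage (requires network access):
# Get-RedirectInfo -Url $list[0]
```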

Checking whether a URL exists therefore depends on how the specific webserver works. In the case of Microsoft, it turns out that valid URLs cause a single redirection, whereas invalid URLs involve more redirections. Limiting redirections to 1 differentiates between valid and invalid URLs.
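One way to express that rule is a small test function like the sketch below. The name Test-SingleRedirect is hypothetical, and the logic assumes the webserver behaves as described here and that you run Windows PowerShell 5.1: with one redirection allowed, a valid URL ends in status 200, while an invalid one is still stuck on a redirect (or returns nothing at all).

```powershell
# Hypothetical helper (not part of the original tip):
# returns $true when a URL resolves within a single redirection
function Test-SingleRedirect {
  param(
    [Parameter(Mandatory)]
    [string]$Url
  )

  # Allow exactly one redirect; ignore the error raised when more are needed
  $response = Invoke-WebRequest -Uri $Url -UseBasicParsing -MaximumRedirection 1 -ErrorAction Ignore

  # Valid URLs finish with 200; anything else counts as "not available"
  ($null -ne $response) -and ($response.StatusCode -eq 200)
}

# Usage (requires network access):
# $list | Where-Object { Test-SingleRedirect -Url $_ }
```

Other webservers may need a different redirection limit or a different check altogether, so treat the value 1 as specific to this site.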

Here is the final solution, which also sports a live progress bar.

It displays the available localized online documents in a grid view window, and you can choose one or more to display in your browser. You could just as well take the result in $result, print it to a PDF, and submit it to multi-language staff members.

$list = $h.Keys |
  ForEach-Object { $URL -f
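For orientation, here is a minimal sketch of how progress reporting, the redirection check, and the grid view could be combined. It is not the original listing: the function name Find-AvailableUrl is invented for this example, and the code assumes Windows PowerShell 5.1 on a Windows desktop (Out-GridView needs a graphical session).

```powershell
# Hypothetical helper (not part of the original tip):
# emits every URL that resolves within a single redirection
function Find-AvailableUrl {
  param(
    [Parameter(Mandatory)]
    [string[]]$Url
  )

  $count = 0
  foreach ($u in $Url) {
    # Update the progress bar for each URL being checked
    $count++
    $percent = [int](100 * $count / $Url.Count)
    Write-Progress -Activity 'Checking URLs' -Status $u -PercentComplete $percent

    # Allow one redirect only; valid URLs then finish with status 200
    $response = Invoke-WebRequest -Uri $u -UseBasicParsing -MaximumRedirection 1 -ErrorAction Ignore
    if (($null -ne $response) -and ($response.StatusCode -eq 200)) { $u }
  }

  # Remove the progress bar when done
  Write-Progress -Activity 'Checking URLs' -Completed
}

# Usage (requires network access and a Windows desktop session):
# $result = Find-AvailableUrl -Url $list
# $result |
#   Out-GridView -Title 'Select documents to open' -PassThru |
#   ForEach-Object { Start-Process $_ }
```

Out-GridView with -PassThru returns only the rows you select, and Start-Process hands each selected URL to the default browser.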

 

