BOM-less UTF-8 character encoding is coming as the default for PowerShell Core on all platforms.

Preference variable $OutputEncoding, which currently defaults to ASCII, must default to [System.Text.UTF8Encoding]::new() (UTF-8 with no BOM). Perhaps preferably, the variable should _not_ be predefined at all, with that encoding (the internally used default) applied in its absence.

- $OutputEncoding tells PowerShell what character encoding to use when sending output _to_ external utilities.

On _Windows_, [Console]::InputEncoding and [Console]::OutputEncoding must both be set to [System.Text.UTF8Encoding]::new(), which is the equivalent of configuring a console window to use code page 65001 (UTF-8) or executing chcp 65001 _before_ PowerShell is launched.

- [Console]::OutputEncoding tells PowerShell what encoding to assume when reading output _from_ external utilities.
- The Start Menu shortcut that is created during installation should be preconfigured to open a console window with code page 65001.
- Conceivably, PowerShell should _automatically_ switch to the 65001 code page when it is launched from a console window with a different active code page (such as from cmd.exe). Note, however, that this change in encoding by default remains in effect until the window is closed, even after exiting PowerShell and returning to cmd.exe; perhaps a warning could be issued on startup.

On _Unix_ platforms with UTF-8-based locales, which are the norm these days, no action is required.

- To be determined: How should the rare event of being invoked from a terminal with a different active character encoding be handled? Changing the encoding on the fly, as on Windows, is not guaranteed to work. Perhaps a warning on startup is sufficient.

Before the above is implemented, the interim workaround to make a console window / terminal use UTF-8 consistently is the following command:

    $OutputEncoding = [Console]::InputEncoding = [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()

Environment data: PowerShell Core v6.0.0-beta.6

I'm not sure that setting $OutputEncoding to UTF-8 without a BOM is correct, at least on some platforms. Here's an example from my MacBook: I have a compressed tar archive, which I would love to unspool by piping it through gunzip and tar.

Setting $outputEncoding to utf8nobom doesn't do the trick:

    PS> $outputEncoding = $utf8

It turns out there are a couple of problems. First, when we read the content, we break up the output into lines. I can fix that with -raw, but that still doesn't work. However, if I read the file with the encoding set to iso-8859-1 and set $outputEncoding to match, _voilà_, it works (sort of)!

    PS> $outputEncoding = $enc
    PS> gc -raw -encoding $enc f.tgz | gunzip | tar tvf -
    gunzip: (stdin): trailing garbage ignored

The last problem is that we seem to tack on [Environment]::NewLine to whatever we push down the pipe, which is causing gunzip to complain about the trailing garbage.

The real problem here - a separate issue - is that _PowerShell knows ONLY text_: it lacks support for passing _binary data_ through the pipeline. Get-Content is designed for reading _text_ files, and -Raw makes matters worse in this case by reading the entire file _at once_. Appending a trailing newline should only ever apply when sending _text_ to external utilities.
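
To make the encodings mentioned in this thread concrete, here is a minimal sketch that compares ASCII, BOM-less UTF-8, and ISO-8859-1 using only .NET encoding classes. The variable names are illustrative assumptions; the thread does not show how its own $enc and $utf8 were defined. The sketch does not touch the console or any external utility.

```powershell
# Illustrative names only; the thread's own $enc and $utf8 definitions are not shown.
$utf8NoBom = [System.Text.UTF8Encoding]::new()                  # parameterless constructor: no BOM
$ascii     = [System.Text.ASCIIEncoding]::new()                 # the default the thread wants to replace
$latin1    = [System.Text.Encoding]::GetEncoding('iso-8859-1')  # encoding used in the gunzip experiment

# BOM-less UTF-8 has an empty preamble.
$preambleLength = $utf8NoBom.GetPreamble().Length

# ASCII cannot represent U+00E9 (e with acute accent); the encoder substitutes '?'.
$eAcute      = [string][char]0xE9
$asciiResult = $ascii.GetString($ascii.GetBytes($eAcute))

# Decode every possible byte value as text, then encode the text again.
[byte[]] $allBytes = 0..255
$viaLatin1 = $latin1.GetBytes($latin1.GetString($allBytes))
$viaUtf8   = $utf8NoBom.GetBytes($utf8NoBom.GetString($allBytes))

# Compare the round-tripped bytes with the originals via their hex representation.
$original       = [System.BitConverter]::ToString($allBytes)
$latin1Lossless = [System.BitConverter]::ToString($viaLatin1) -eq $original
$utf8Lossless   = [System.BitConverter]::ToString($viaUtf8) -eq $original

[pscustomobject]@{
    Utf8PreambleLength = $preambleLength
    AsciiOfEAcute      = $asciiResult
    Latin1RoundTrips   = $latin1Lossless
    Utf8RoundTrips     = $utf8Lossless
}
```

The ISO-8859-1 round trip should come back byte-for-byte identical, while the UTF-8 one should not, because bytes that are invalid as UTF-8 are replaced during decoding. That is presumably why the iso-8859-1 experiment gets binary data through a text pipeline at all; it does nothing about the appended newline.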