I'm trying to use Process.Start
with redirected I/O to call PowerShell.exe
with a string, and to get the output back, all in UTF-8. But I don't seem to be able to make this work.
What I've tried:
-Command
parameterConsole.OutputEncoding
in both my console application and in the PowerShell script$OutputEncoding
in PowerShellProcess.StartInfo.StandardOutputEncoding
Encoding.Unicode
instead of Encoding.UTF8
In every case, when I inspect the bytes I'm given, I get different values to my original string. I'd really love an explanation as to why this doesn't work.
Here is my code:
static void Main(string[] args)
{
DumpBytes("Héllo");
ExecuteCommand("PowerShell.exe", "-Command \"$OutputEncoding = [System.Text.Encoding]::UTF8 ; Write-Output 'Héllo';\"",
Environment.CurrentDirectory, DumpBytes, DumpBytes);
Console.ReadLine();
}
static void DumpBytes(string text)
{
Console.Write(text + " " + string.Join(",", Encoding.UTF8.GetBytes(text).Select(b => b.ToString("X"))));
Console.WriteLine();
}
static int ExecuteCommand(string executable, string arguments, string workingDirectory, Action<string> output, Action<string> error)
{
try
{
using (var process = new Process())
{
process.StartInfo.FileName = executable;
process.StartInfo.Arguments = arguments;
process.StartInfo.WorkingDirectory = workingDirectory;
process.StartInfo.UseShellExecute = false;
process.StartInfo.CreateNoWindow = true;
process.StartInfo.RedirectStandardOutput = true;
process.StartInfo.RedirectStandardError = true;
process.StartInfo.StandardOutputEncoding = Encoding.UTF8;
process.StartInfo.StandardErrorEncoding = Encoding.UTF8;
using (var outputWaitHandle = new AutoResetEvent(false))
using (var errorWaitHandle = new AutoResetEvent(false))
{
process.OutputDataReceived += (sender, e) =>
{
if (e.Data == null)
{
outputWaitHandle.Set();
}
else
{
output(e.Data);
}
};
process.ErrorDataReceived += (sender, e) =>
{
if (e.Data == null)
{
errorWaitHandle.Set();
}
else
{
error(e.Data);
}
};
process.Start();
process.BeginOutputReadLine();
process.BeginErrorReadLine();
process.WaitForExit();
outputWaitHandle.WaitOne();
errorWaitHandle.WaitOne();
return process.ExitCode;
}
}
}
catch (Exception ex)
{
throw new Exception(string.Format("Error when attempting to execute {0}: {1}", executable, ex.Message),
ex);
}
}
I found that if I make this script:
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
Write-Host "Héllo!"
[Console]::WriteLine("Héllo")
Then invoke it via:
ExecuteCommand("PowerShell.exe", "-File C:\\Users\\Paul\\Desktop\\Foo.ps1",
Environment.CurrentDirectory, DumpBytes, DumpBytes);
The first line is corrupted, but the second isn't:
H?llo! 48,EF,BF,BD,6C,6C,6F,21
Héllo 48,C3,A9,6C,6C,6F
This suggests to me that my redirection code is all working fine; when I use Console.WriteLine
in PowerShell I get UTF-8 as I expect.
This means that PowerShell's Write-Output
and Write-Host
commands must be doing something different with the output, and not simply calling Console.WriteLine
.
I've even tried the following to force the PowerShell console code page to UTF-8, but Write-Host
and Write-Output
continue to produce broken results while [Console]::WriteLine
works.
$sig = @'
[DllImport("kernel32.dll")]
public static extern bool SetConsoleCP(uint wCodePageID);
[DllImport("kernel32.dll")]
public static extern bool SetConsoleOutputCP(uint wCodePageID);
'@
$type = Add-Type -MemberDefinition $sig -Name Win32Utils -Namespace Foo -PassThru
$type::SetConsoleCP(65001)
$type::SetConsoleOutputCP(65001)
Write-Host "Héllo!"
& chcp # Tells us 65001 (UTF-8) is being used
This question is related to
powershell
encoding
utf-8
character-encoding
io-redirection
Set the [Console]::OuputEncoding
as encoding whatever you want, and print out with [Console]::WriteLine
.
If powershell ouput method has a problem, then don't use it. It feels bit bad, but works like a charm :)
Not an expert on encoding, but after reading these...
... it seems fairly clear that the $OutputEncoding variable only affects data piped to native applications.
If sending to a file from withing PowerShell, the encoding can be controlled by the -encoding
parameter on the out-file
cmdlet e.g.
write-output "hello" | out-file "enctest.txt" -encoding utf8
Nothing else you can do on the PowerShell front then, but the following post may well help you:.
Spent some time working on a solution to my issue and thought it may be of interest. I ran into a problem trying to automate code generation using PowerShell 3.0 on Windows 8. The target IDE was the Keil compiler using MDK-ARM Essential Toolchain 5.24.1. A bit different from OP, as I am using PowerShell natively during the pre-build step. When I tried to #include the generated file, I received the error
fatal error: UTF-16 (LE) byte order mark detected '..\GITVersion.h' but encoding is not supported
I solved the problem by changing the line that generated the output file from:
out-file -FilePath GITVersion.h -InputObject $result
to:
out-file -FilePath GITVersion.h -Encoding ascii -InputObject $result
Source: Stackoverflow.com