xml – Processing a text file or string variable that contains foreign-language characters

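I want to use VBA functions such as LCase$() and then UCase() on text from my test.xml file, which is UTF-8 encoded. The sample code below loads the file with its UTF-8 content: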

Dim objFileSystem, objInputFile

Set objFileSystem = CreateObject("Scripting.fileSystemObject")
Set objInputFile = objFileSystem.OpenTextFile("c:\test.xml", 1)

inputData = objInputFile.ReadAll

objInputFile.Close

Now I try to convert the content to lowercase and then change the first letter to uppercase:

Var = inputData
Var = LCase$(Var)

Select Case Len(Var)

Case 0
CapitilizeFirstLetter = ""

Case 1
CapitilizeFirstLetter = UCase(Var)

Case Else
CapitilizeFirstLetter = Ucase(Left(Var, 1)) & mid(Var, 2)

End Select

Then I try to save the file content under the name test_edited.xml:

FileUrl = "c:\test_edited.xml"

Set objStream = CreateObject("ADODB.Stream")
With objStream
    .Open
    .Charset = "utf-8" 'for Russian: iso-8859-5
    .Position = objStream.Size
    .WriteText=Var
    .Flush
    .Position = 0
    .Type     = 1 'binary
    .Read(3)      'skip BOM
    .SaveToFile FileUrl,2
    .Close
End With
Set objStream = Nothing

As a result, the content of the first file is:

Nejznámější ŽENY, MODELY, herečka, zpěvačka

The second one is now:

Nejznámější ŽENY, MODELY, herečka, zpěvačka

And I expected it to look like this:

Nejznámější ženy, modely, herečka, zpěvačka

What am I doing wrong?

I am using Basic IDE ver 6.4.

The complete code should look like this:

Sub Main

'getting variable from outside
ChanNum = DDEInitiate("MacroEngine", "MacroGetVar")
Var$= DDERequest$(ChanNum, "vChannelOpisA")
    DDETerminate ChanNum


Var = LCase$(Var) ' converting utf-8 encoded string to lower case

'change first letter to upper case
Select Case Len(Var)

Case 0
CapitilizeFirstLetter = ""

Case 1
CapitilizeFirstLetter = UCase(Var)

Case Else
CapitilizeFirstLetter = Ucase(Left(Var, 1)) & mid(Var, 2)

End Select

'sending variable to outside of vb script
ChanNum = DDEInitiate("MacroEngine","MacroSetVar")
Var = "vChannelOpisA=" + CapitilizeFirstLetter
DDEExecute (ChanNum, Var)
DDETerminate ChanNum

End Sub

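The variable named Var should end up UTF-8 encoded so that it can be written to an XML file.
I could also read the string from a file instead of getting it with DDERequest.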

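Best answer: In my experience, UTF-8 and ISO 8859-1 in VBA can be tricky, because the result depends on the editor that produced the file and on the system environment (Unix, Windows, or Mac). Most text editors and systems default to ANSI. I suggest trying ADODB, since it may read and render UTF-8 correctly, while another object may be better suited to writing UTF-8 out.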

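Const bufFile = "c:\test.xml"
Const stf = "c:\test_edited.xml"
Const ForWriting = 2
'TristateUseDefault writes in the system's ANSI code page;
'characters outside that code page cannot be written this way
Const TristateUseDefault = -2

Dim adoRead As Object
Dim fso As Object
Dim ftxt As Object
Dim vData As Variant
Dim j As Long

'ADODB stream to read
Set adoRead = CreateObject("ADODB.Stream")
adoRead.Charset = "utf-8" 'must match the encoding of the source file
adoRead.Open
adoRead.LoadFromFile bufFile
vData = Split(adoRead.ReadText, vbCrLf)
adoRead.Close

'text stream to write; the file is created if it does not exist
Set fso = CreateObject("Scripting.FileSystemObject")
Set ftxt = fso.OpenTextFile(stf, ForWriting, True, TristateUseDefault)

'process your data as intended
For j = LBound(vData) To UBound(vData)
   'code to capitalize...
   '...
   'write to
   ftxt.WriteLine vData(j)
Next j
ftxt.Close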

This structure worked for me with French characters, and I think it should behave the same way with the utf-8 or unicode charsets.
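
For reference, here is a minimal sketch of the whole round trip from the question using ADODB.Stream for both reading and writing, so the text is decoded as UTF-8 before LCase/UCase see it and encoded as UTF-8 again on the way out. It is only a sketch under some assumptions: the helper names ReadTextFile, CapitalizeFirstLetter, and WriteUtf8NoBom are made up for illustration, the host is assumed to allow CreateObject("ADODB.Stream") (the question already relies on it), and LCase/UCase are assumed to handle the accented characters once the string has been decoded correctly. I have not tried it in Basic IDE 6.4.

```vb
Option Explicit

Const adTypeBinary = 1
Const adTypeText = 2
Const adSaveCreateOverWrite = 2

' Read a whole file as text, decoding it with the given charset.
Function ReadTextFile(path, charset)
    Dim stm
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = charset      ' e.g. "utf-8"; must match the file on disk
    stm.Open
    stm.LoadFromFile path
    ReadTextFile = stm.ReadText
    stm.Close
End Function

' Lowercase the text, then uppercase only its first character.
Function CapitalizeFirstLetter(text)
    Dim s
    s = LCase(text)
    If Len(s) > 0 Then s = UCase(Left(s, 1)) & Mid(s, 2)
    CapitalizeFirstLetter = s
End Function

' Write text as UTF-8 without a byte order mark.
Sub WriteUtf8NoBom(path, text)
    Dim txt, bin
    Set txt = CreateObject("ADODB.Stream")
    txt.Type = adTypeText
    txt.Charset = "utf-8"
    txt.Open
    txt.WriteText text
    txt.Position = 3           ' skip the 3-byte UTF-8 BOM that ADODB emits

    Set bin = CreateObject("ADODB.Stream")
    bin.Type = adTypeBinary
    bin.Open
    txt.CopyTo bin             ' copies from the current position to the end
    bin.SaveToFile path, adSaveCreateOverWrite
    bin.Close
    txt.Close
End Sub

Sub Main()
    Dim content
    content = ReadTextFile("c:\test.xml", "utf-8")
    WriteUtf8NoBom "c:\test_edited.xml", CapitalizeFirstLetter(content)
End Sub
```

Copying into a second, binary stream is what drops the BOM here: the text stream is positioned just past the first three bytes before CopyTo runs. Note that if the file starts with markup such as an XML declaration, the first character is "<", so in practice you would apply CapitalizeFirstLetter to the extracted text value rather than to the whole file.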

Cheers,

Pascal
