Converting XML to YAML with Ruby and Hpricot: what's going wrong here?

I'm trying to output an XML file, blog.xml, as YAML so that I can drop it into vision.app, a tool for designing Shopify e-commerce sites locally.

Shopify's YAML looks like this:

- id: 2
  handle: bigcheese-blog
  title: Bigcheese blog
  url: /blogs/bigcheese-blog
  articles:
    - id: 1
      title: 'One thing you probably did not know yet...'
      author: Justin
      content: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
      created_at: 2005-04-04 16:00
      comments:
        - 
          id: 1
          author: John Smith
          email: john@smith.com
          content: Wow...great article man.
          status: published
          created_at: 2009-01-01 12:00
          updated_at: 2009-02-01 12:00
          url: ""
        - 
          id: 2
          author: John Jones
          email: john@jones.com
          content: I really enjoyed this article. And I love your shop! It's awesome. Shopify rocks!
          status: published
          created_at: 2009-03-01 12:00
          updated_at: 2009-02-01 12:00
          url: "http://somesite.com/"
    - id: 2
      title: Fascinating
      author: Tobi
      content: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
      created_at: 2005-04-06 12:00
      comments:
  articles_count: 2
  comments_enabled?: true 
  comment_post_url: ""
  comments_count: 2
  moderated?: true

However, a sample of my XML looks like this:

       <article>
          <author>Rouska Mellor</author>
          <blog-id type="integer">273932</blog-id>
          <body>Worn Again are hiring for a new Sales Director.

      To view the full job description and details of how to apply click &quot;here&quot;:http://antiapathy.org/?page_id=83</body>
          <body-html>&lt;p&gt;Worn Again are hiring for a new Sales Director.&lt;/p&gt;
      &lt;p&gt;To view the full job description and details of how to apply click &lt;a href=&quot;http://antiapathy.org/?page_id=83&quot;&gt;here&lt;/a&gt;&lt;/p&gt;</body-html>
          <created-at type="datetime">2009-07-29T13:58:59+01:00</created-at>
          <id type="integer">1179072</id>
          <published-at type="datetime">2009-07-29T13:58:59+01:00</published-at>
          <title>Worn Again are hiring!</title>
          <updated-at type="datetime">2009-07-29T13:59:40+01:00</updated-at>
        </article>
        <article>

I naively assumed that converting from one serialized data format to another would be fairly straightforward, and that I could simply do this:

>> require 'hpricot'
=> true
>> b = Hpricot.XML(open('blogs.xml'))
>> puts b.to_yaml

But I get this error:

NoMethodError: undefined method `yaml_tag_subclasses?' for Hpricot::Doc:Class
    from /usr/local/lib/ruby/1.8/yaml/tag.rb:69:in `taguri'
    from /usr/local/lib/ruby/1.8/yaml/rubytypes.rb:16:in `to_yaml'
    from /usr/local/lib/ruby/1.8/yaml.rb:391:in `call'
    from /usr/local/lib/ruby/1.8/yaml.rb:391:in `emit'
    from /usr/local/lib/ruby/1.8/yaml.rb:391:in `quick_emit'
    from /usr/local/lib/ruby/1.8/yaml/rubytypes.rb:15:in `to_yaml'
    from /usr/local/lib/ruby/1.8/yaml.rb:117:in `dump'
    from /usr/local/lib/ruby/1.8/yaml.rb:432:in `y'
    from (irb):6
    from :0
>>

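How can I get the data output in the form shown at the top of this question? I've tried importing the 'yaml' gem, thinking I was missing some of these methods, but that didn't help either.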

Best answer: Sorry, Josh, I think what you've found here is a limitation in Hpricot and/or the YAML library, pure and simple.

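I'm not sure Hpricot ever supported YAML in this way. The method in question is added dynamically by the YAML library to the Object class, along with other basic Ruby types, but for some reason it is not present on Hpricot::Doc, even though Hpricot::Doc does appear to inherit indirectly from Object.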

I can say that I reproduced it too, so it's not just you.

You can add the missing method very easily:

class Hpricot::Doc
  # YAML calls this on the class while building the type tag.
  def self.yaml_tag_subclasses?
    true
  end
end
b = Hpricot.XML(open('blogs.xml'))
puts b.to_yaml

But you'll find this doesn't get you much further. Here's what I get:

--- !ruby/object:Hpricot::Doc 
options: 
  :xml: true

So we're not iterating through the container as we should.
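
To illustrate what that output suggests: for an object it has no special handling for, the YAML library emits a class tag followed by the object's instance variables and nothing else. The sketch below uses FakeDoc, a hypothetical stand-in rather than Hpricot's real class, and assumes (I have not checked Hpricot's source) that Hpricot::Doc keeps its child nodes somewhere YAML does not look.

```ruby
require 'yaml'

# FakeDoc is a hypothetical stand-in, not Hpricot's actual implementation.
class FakeDoc
  def initialize(options)
    @options = options   # the only state YAML can see on this object
  end

  # Children are produced on demand here, so they never appear in the dump.
  def children
    ['<article>', '<article>']
  end
end

dump = FakeDoc.new(:xml => true).to_yaml
puts dump   # a !ruby/object:FakeDoc tag plus the options hash, no children
```

If that reading is right, no amount of patching the tag lookup will help; the content has to be walked explicitly.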

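At this point, the brute-force way (and possibly the only way) to get YAML support through the YAML library is to add to_yaml methods to Hpricot's classes to teach them how to emit YAML properly. Take a look at "/usr/lib/ruby/1.8/yaml/rubytypes.rb" (on a Mac it will be something like "/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/yaml/rubytypes.rb") for worked examples covering each of the basic Ruby types. The classes you would probably need to extend are defined on the C side: see the method Init_hpricot_scan in "hpricot/ext/hpricot_scan/hpricot_scan.rl".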
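
An alternative to patching Hpricot's classes, sketched here as an assumption rather than something tested against vision.app: copy the fields you need into plain Hashes and Arrays, which YAML already knows how to dump. To keep it self-contained this uses REXML from Ruby's standard library instead of Hpricot. The `<articles>` wrapper element, the placeholder handle/title/url values, and the helper name article_to_hash are all made up for illustration.

```ruby
require 'rexml/document'
require 'yaml'

# Sample trimmed from the question; the <articles> wrapper is an assumption.
SAMPLE_XML = <<XML_DOC
<articles>
  <article>
    <author>Rouska Mellor</author>
    <blog-id type="integer">273932</blog-id>
    <body>Worn Again are hiring for a new Sales Director.</body>
    <created-at type="datetime">2009-07-29T13:58:59+01:00</created-at>
    <id type="integer">1179072</id>
    <title>Worn Again are hiring!</title>
  </article>
</articles>
XML_DOC

# Hypothetical helper: turn one <article> element into a plain Hash.
def article_to_hash(article)
  text = lambda { |name| article.elements[name].text }
  {
    'id'         => text.call('id').to_i,
    'title'      => text.call('title'),
    'author'     => text.call('author'),
    'content'    => text.call('body'),
    # Keep "YYYY-MM-DD HH:MM" and drop seconds and UTC offset (assumed format).
    'created_at' => text.call('created-at')[0, 16].tr('T', ' '),
    'comments'   => []
  }
end

doc      = REXML::Document.new(SAMPLE_XML)
nodes    = REXML::XPath.match(doc, '//article')
articles = nodes.map { |node| article_to_hash(node) }

blog = {
  'id'             => nodes.first.elements['blog-id'].text.to_i,
  'handle'         => 'my-blog',        # placeholder, not present in the XML
  'title'          => 'My blog',        # placeholder, not present in the XML
  'url'            => '/blogs/my-blog', # placeholder, not present in the XML
  'articles'       => articles,
  'articles_count' => articles.size
}

yaml = [blog].to_yaml
puts yaml
```

The same shape should carry over to Hpricot by swapping the REXML lookups for Hpricot searches; the point is only that the value handed to to_yaml is built from core Ruby types.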
