    Make UTF-8 the default encoding for XML feeds · 15505ee4
    Peter De Wachter authored
    Consider the feed http://planet.haskell.org/atom.xml
    - This is a UTF-8 encoded XML file
    - No encoding declaration in the XML header
    - No Unicode byte order mark
    - Served with HTTP Content-Type "text/xml" (no charset parameter)
    
    Miniflux lets charset.NewReader handle this. The charset package
    implements the HTML5 character encoding algorithm, which, in this
    situation, defaults to windows-1252 encoding if there are no non-ASCII
    UTF-8 characters in the first 1024 bytes. So for this feed, we get the
    wrong encoding.
    
    I inserted an explicit "utf8.Valid()" check, which fixes this problem.
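
A minimal sketch of the idea behind the check, using only the standard library. This is not the actual Miniflux patch: the helper name chooseXMLEncoding, its parameters, and the "sniff" return value are hypothetical and only illustrate the decision order (declared charset first, then a utf8.Valid test on the whole payload, then the HTML5-style detection as a fallback).

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// chooseXMLEncoding is a hypothetical helper, not code from Miniflux.
// declared is the charset taken from the HTTP Content-Type header or the
// XML declaration, or "" when neither specifies one.
func chooseXMLEncoding(data []byte, declared string) string {
	if declared != "" {
		// An explicit declaration always wins.
		return declared
	}
	if utf8.Valid(data) {
		// The whole document decodes as UTF-8 (pure ASCII included),
		// so treat it as UTF-8 instead of guessing windows-1252.
		return "utf-8"
	}
	// Assumption: the caller would hand the data to a sniffing reader
	// such as charset.NewReader at this point.
	return "sniff"
}

func main() {
	asciiFeed := []byte("<feed><title>Planet Haskell</title></feed>")
	latin1Feed := []byte("<feed><title>caf\xe9</title></feed>") // 0xE9 alone is not valid UTF-8

	fmt.Println(chooseXMLEncoding(asciiFeed, ""))            // utf-8
	fmt.Println(chooseXMLEncoding(latin1Feed, ""))           // sniff
	fmt.Println(chooseXMLEncoding(latin1Feed, "iso-8859-1")) // iso-8859-1
}
```

Validating the full payload rather than a fixed-size prefix means a feed whose first non-ASCII character appears late is still recognised as UTF-8.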