Details

    • Bug
    • Resolution: Won't Fix
    • Minor
    • None
    • None
    • Mac OS/X 10.6.6 + Firefox 3.6.17
      Fedora 13 + Firefox 3.5.15
    • 1
    • 3
    • 7203

    Description

      There are a large number of "unknown" characters in the HTML version of the Lustre manual. On my system they appear as black diamonds with a question mark like '�'.

      For example, the copyright (C) character right at the start of the manual and all of the accented characters in the Oracle boilerplate are shown this way, as is the hard-space character (I guess) in every section and subsection title.

          Activity

            [LUDOC-7] character set problems in HTML manual

            rhenwood Richard Henwood (Inactive) added a comment:

            We won't be fixing this. The XHTML version renders fine, and that is the version most commonly linked to.
            adilger Andreas Dilger added a comment (edited):

            Sadly, this doesn't work:

            legalnoticeOracle.xml:17: parser error : Entity 'copy' not defined
            <para>Copyright &copy; 2011, Oracle et/ou ses affili&eacute;s. Tous droits
            ^
            legalnoticeOracle.xml:17: parser error : Entity 'eacute' not defined
            <para>Copyright &copy; 2011, Oracle et/ou ses affili&eacute;s. Tous droits
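The parser errors above are consistent with how XML treats named entities: XML predefines only `&amp;`, `&lt;`, `&gt;`, `&apos;` and `&quot;`, so `&copy;` and `&eacute;` are undefined unless a DTD declares them, while numeric character references need no declaration. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`, which is not the toolchain that produced the log above. The sample strings are shortened from the log line, and the `parses` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample input modeled on the log line above (shortened); illustration only.
named = "<para>Copyright &copy; 2011, Oracle et/ou ses affili&eacute;s.</para>"
# Same text using numeric character references instead of named entities.
numeric = "<para>Copyright &#169; 2011, Oracle et/ou ses affili&#233;s.</para>"


def parses(xml_text):
    """Return True if a non-validating XML parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for the undefined named entities, among other errors.
        return False


named_ok = parses(named)
numeric_ok = parses(numeric)
numeric_text = ET.fromstring(numeric).text
```

If the manual's sources must keep the named forms, the entities would have to be declared in a DTD that the build can see. Whether that is practical for this manual is not something the thread settles.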

            adilger Andreas Dilger added a comment:

            http://review.whamcloud.com/7739

            Some non-ASCII characters, such as accented letters, copyright (c),
            and single quotes (rather than apostrophes) were being rendered
            incorrectly in the HTML version of the manual, because of confusion
            between character sets (UTF-8 vs. ISO-8859-1).

            Instead of using the encoded characters directly in the manual,
            use the HTML escape sequences such as &copy;, &agrave;, etc.
            These can be rendered correctly for both the HTML and PDF manuals.

            This doesn't resolve the use of ISO-8859-1 hard spaces in the titles,
            but at least fixes the most visible mess at the start of the manual.
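The character-set confusion described in this comment can be reproduced in a few lines. The sketch below is an illustration in Python, not part of the change under review. It assumes the sample sentence shown, decodes ISO-8859-1 bytes as UTF-8 to get the replacement character from the bug description, and then escapes the text to ASCII-only numeric character references. Numeric references are a swapped-in alternative here; the change itself describes named HTML escape sequences.

```python
# Sample text containing the two kinds of characters mentioned in the report.
text = "Copyright © 2011, Oracle et/ou ses affiliés."

# ISO-8859-1 bytes decoded as UTF-8: each non-ASCII byte is an invalid
# UTF-8 sequence here and is replaced by U+FFFD (the "black diamond").
latin1_as_utf8 = text.encode("iso-8859-1").decode("utf-8", errors="replace")

# The opposite mistake: UTF-8 bytes decoded as ISO-8859-1 turn each
# accented character into a two-character sequence.
utf8_as_latin1 = text.encode("utf-8").decode("iso-8859-1")

# ASCII-only output: every non-ASCII character becomes a numeric character
# reference, so the bytes are identical in UTF-8 and ISO-8859-1.
escaped = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

print(latin1_as_utf8)
print(utf8_as_latin1)
print(escaped)
```

Because the escaped form is pure ASCII, it does not depend on which of the two character sets the browser or webserver assumes.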

            rhenwood Richard Henwood (Inactive) added a comment:

            The webserver configuration is causing two problems:

            • It makes browsers render some characters incorrectly.
            • It prevents the manual from appearing in search engine results.

            rhenwood Richard Henwood (Inactive) added a comment:

            I believe the reason the 'x' makes a difference to the rendering is that the webserver that Jira uses is not very clever... I've discussed this in: https://jira.hpdd.intel.com/browse/LUDOC-7?focusedCommentId=17320&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17320
            adilger Andreas Dilger added a comment (edited):

            I just found that there are HTML codes for these:

            • copyright = &copy;
            • trademark = &trade;
            • accented characters: see http://symbolcodes.tlt.psu.edu/web/codehtml.html#accent

            adilger Andreas Dilger added a comment:

            It looks like the .xhtml version of the manual does not have this problem. Is that just because the filename extension is not .html?

            rhenwood Richard Henwood (Inactive) added a comment:

            One additional piece of information I've just noticed:

            lustre_manual.diff.html <- encoding appears correct
            lustre_manual.html <- encoding appears incorrect

            adilger Andreas Dilger added a comment:

            Yes, Joshua could probably understand what is going on here a lot faster than I could.

            jessica Jessica A. Popp (Inactive) added a comment:

            Would this be something for Joshua to help resolve?

            People

              rhenwood Richard Henwood (Inactive)
              adilger Andreas Dilger