j4: (dodecahedron)
[personal profile] j4
Does anybody know of any good, preferably FREE (as in beer) software for converting RTF to HTML? We've been using r2h95 (which is shareware) but it doesn't work with Win2K, which we've just upgraded to. We need something that will run on Win2K or linux, preferably, though MacOS stuff could be considered.

Yes, I know we could roll our own, but it would be nice not to have to reinvent wheels which are already rolling along happily.

* * *

Whew. Just tried this one and while it does convert, the last line of the HTML it outputs is:


</font></B></font></font></font></B></font></font></Body></font></Body></Html>


*groan*

Date: 2004-06-22 01:27 am (UTC)
From: [identity profile] huskyteer.livejournal.com
</font></B></font></font></font></B></font></font></Body></font></Body></Html>

Wow, it's almost as good as Dreamweaver!

Date: 2004-06-22 01:31 am (UTC)
From: [identity profile] wechsler.livejournal.com
Try running the output of that through Tidy: http://tidy.sourceforge.net/ ? ;)
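
For anyone wanting to script that clean-up step, here is a minimal sketch in Python. It assumes the `tidy` binary is installed and on the PATH; the helper names `tidy_command` and `run_tidy` are made up for illustration. It uses only Tidy's `-q` (quiet) and `-o` (output file) options.

```python
import shutil
import subprocess


def tidy_command(src, dst):
    """Build the command line for one Tidy run (hypothetical helper)."""
    # -q suppresses the non-essential chatter, -o names the output file.
    return ["tidy", "-q", "-o", dst, src]


def run_tidy(src, dst):
    """Run Tidy on src, writing the cleaned markup to dst.

    Returns Tidy's exit status: 0 means clean, 1 means warnings were
    issued, 2 means errors were found.
    """
    if shutil.which("tidy") is None:
        raise FileNotFoundError("tidy is not installed or not on PATH")
    # No check=True here: a status of 1 (warnings) is normal for messy input.
    return subprocess.run(tidy_command(src, dst)).returncode
```

Something like `run_tidy("converted.html", "cleaned.html")` would then sit after the converter in a batch job. With a status of 2 Tidy may decline to write any output at all, so the return value is worth checking.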

Date: 2004-06-22 01:52 am (UTC)
From: [identity profile] rbarclay.livejournal.com
You could always resort to OOo (http://openoffice.org/), even if it nearly redefines the meaning of BloatWare.

Date: 2004-06-22 01:53 am (UTC)
From: [identity profile] crazyscot.livejournal.com
My Debian box at work knows about unrtf, which sounds like it might do the job though it's no longer supported by the original author, who has gone down the shareware road - see http://home.comcast.net/~smithz/. Beware, though, that the darker depths of RTF are rumoured to be Microsoft-proprietary.

Date: 2004-06-22 01:56 am (UTC)
From: [identity profile] imc.livejournal.com
I've previously used GNU UnRTF (http://www.gnu.org/software/unrtf/unrtf.html) with a certain amount of success to read RTF files. I don't know much about its HTML output because I usually use the plain text filter. (It has LaTeX and PostScript filters too, but the HTML one claims to be the most developed.)

No idea what systems it runs on, but it certainly runs on Unix, and Mac is Unixy, right?

Date: 2004-06-22 02:02 am (UTC)
From: [identity profile] imc.livejournal.com
(OK, call me silly - I didn't notice you'd actually mentioned Linux in the question. You'll have no problems getting it to run on that.)

The latest version seems to be 0.19.1 and it's worth getting because 0.18.1 sometimes crashes and they claim to have fixed that (or at least some crashing).
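
A rough sketch of driving UnRTF from a script, in Python. It assumes `unrtf` is on the PATH and relies only on its `--html` option, which sends the converted document to standard output; `unrtf_command` and `rtf_to_html` are hypothetical helper names, and the UTF-8 decoding is a guess at the output encoding rather than something UnRTF promises.

```python
import shutil
import subprocess


def unrtf_command(rtf_path):
    """Build the command line for an RTF-to-HTML run (hypothetical helper)."""
    return ["unrtf", "--html", rtf_path]


def rtf_to_html(rtf_path):
    """Convert one RTF file and return the HTML that unrtf prints."""
    if shutil.which("unrtf") is None:
        raise FileNotFoundError("unrtf is not installed or not on PATH")
    # check=True raises CalledProcessError if unrtf exits non-zero,
    # which is one way to notice a crash on a given file.
    result = subprocess.run(
        unrtf_command(rtf_path), capture_output=True, check=True
    )
    # Assumption: decode as UTF-8 and replace anything that does not fit.
    return result.stdout.decode("utf-8", errors="replace")
```

Wrapping the call this way makes it easy to loop over a directory of incoming files and log which ones fail.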

Date: 2004-06-22 02:22 am (UTC)
From: [identity profile] j4.livejournal.com
Yeah, that helps, but it'd be nice to start with something a bit tidier...

Date: 2004-06-22 02:25 am (UTC)
From: [identity profile] j4.livejournal.com
I assume this is a replacement for MSOffice? If so, I'm afraid it's no help -- we don't get any say in how the original document is created, it will come to us as something which Word can output, whether we like it or not. :-(

Date: 2004-06-22 02:55 am (UTC)
From: [identity profile] rbarclay.livejournal.com
Yeah, it's an office suite (free as in speech). But I meant using it just as a converter, e.g. open the file, save it as something different.

Date: 2004-06-22 03:05 am (UTC)
From: [personal profile] chrisvenus
Anything that writes HTML with two body tags is not to be trusted... Unfortunately I have no idea what would be best to convert. I'd probably take a similar route to somebody else's suggestion and open it in Word and save it as non-bloaty HTML. Probably doesn't produce great HTML either but at least it wouldn't start wrapping body tags in font tags.... Eww! :)

And not helpful I know. I just needed to briefly release my anger at those body tags. :)
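
For screening converter output automatically, a check along these lines could flag the problem. This is a minimal sketch using Python's standard `html.parser`; the `TagAudit` class and `audit` function are invented for illustration, and it only counts body tags and unmatched closing tags, which is far short of real validation.

```python
from html.parser import HTMLParser


class TagAudit(HTMLParser):
    """Count body tags and closing tags that have no matching open tag."""

    def __init__(self):
        super().__init__()
        self.stack = []          # currently open tags, innermost last
        self.body_opens = 0
        self.body_closes = 0
        self.stray_closes = []   # closing tags seen with nothing to close

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.body_opens += 1
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag == "body":
            self.body_closes += 1
        if tag in self.stack:
            # Pop back to the most recent matching open tag, discarding
            # anything left unclosed inside it.
            while self.stack.pop() != tag:
                pass
        else:
            self.stray_closes.append(tag)


def audit(html):
    """Feed a string of HTML through TagAudit and return the parser."""
    parser = TagAudit()
    parser.feed(html)
    parser.close()
    return parser
```

The parser lowercases tag names, so `</Body>` and `</body>` are counted together. Anything reporting more than one body tag, or a non-empty `stray_closes`, is probably worth a look before it goes any further.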

Date: 2004-06-22 04:13 am (UTC)
From: [identity profile] j4.livejournal.com
Oh, I see. ... Is its save-as-HTML any better than Word's, then?

Date: 2004-06-22 04:14 am (UTC)
From: [identity profile] j4.livejournal.com
Have you seen Word's save-to-HTML? Nested body/font tags would be the least of your worries!

Date: 2004-06-22 04:48 am (UTC)
From: [personal profile] chrisvenus
It's not too bad. At least it is valid. And with Office 2000 you can get an HTML filter thingy from Microsoft that will allow you to save without the Office metainfo in there, and this comes built in to later versions (I think). I'd much rather have bloated but valid HTML because that is easier to filter. If something is giving me two body tags, something that really shouldn't even be allowed, I wouldn't trust it to give me any kind of markup that would be properly interpreted by a browser. On the other hand, this is almost certainly a personal preference thing, and I would agree that loading into Word and saving is not the best option here anyway, so it's something of a moot point.

Date: 2004-06-22 04:58 am (UTC)
From: [identity profile] j4.livejournal.com
Agreed, valid HTML is better than invalid, but Word's HTML has so much extra crap in it that "filtering it" involves basically rewriting the HTML from scratch. Not convinced that's any better than rolling your own RTF-to-HTML converter in the first place!

Date: 2004-06-22 06:20 am (UTC)
From: [identity profile] oldbloke.livejournal.com
How could it be worse?

Date: 2004-06-22 06:28 am (UTC)
From: [identity profile] oldbloke.livejournal.com
I just tried opening an rtf (created in Word) in StarOffice6 and saving it as html.
It puts more in than I'd like (blank lines replaced by p blocks with a style attribute), but fairly sensible other than that.
It does put some meta stuff in the top so you know it went through Star Office.
otoh, the original rtf was really a notepad file with zero interesting content, so i dunno how SO6 would get on with something more complex.
Can you run SO on your platform?

Date: 2004-06-22 06:59 am (UTC)
From: [identity profile] j4.livejournal.com
Dunno if we can run SO -- I don't have a linux box, only the boys have those. 8-) Will ask 'em.

"Something more complex" is the problem -- we get lots of stuff with weird-ass formatting which we're expected to preserve and/or turn into something useful. Lots of styles, headers, tables, lions, tigers, bears -- oh my! -- bells, whistles, Old Uncle Tom Cobbleigh and all.

Which I really should get back to. :-(

Date: 2004-06-22 07:41 am (UTC)
From: [identity profile] rbarclay.livejournal.com
I've no idea.

Date: 2004-06-22 09:54 am (UTC)
From: [identity profile] burkesworks.livejournal.com
Pretty sure you can run StarOffice on Win2k.... got a copy lying around doing nothing that you can have.

Date: 2004-06-23 01:06 am (UTC)
From: [identity profile] oldbloke.livejournal.com
If you can't run SO, you should be able to run its kissin' cousin OpenOffice - they cover almost all platforms between them.
