User Tools

Site Tools


home:software:windows:word2doku

Microsoft Word to DokuWiki Converter

This is a vbScript that will do basic formatting conversion from word to dokuwiki. It's not perfect, but it will save you a lot of time if you have lots of word docs like me. Also, cells pasted into word from excel will be converted, so it's almost an excel converter.

It's poorly written ugly uncommented crappy code that has come to you as a fixed up version of an old version which supported old syntax which was created by someone else mashing together conversion code for openoffice and mediawiki. So yeh, it's crap!! Here's a readme that, just like the code, is mostly ripped off from other people. I give all the credit for this to whoever it was that did it, except for the bits I fixed up so it would actually work ;) So here's a fixed up version of their readme:

INTRO

Convert Microsoft Word document content to MediaWiki markup.

This is a Word Visual Basic macro. Usage requires a running copy of Microsoft Word that supports Visual Basic macros. (Word 97 or greater). Features

  • Replaces smart quotes/double-quotes with dumb equivalents
  • Escapes the following characters: * # { } [ ] ~ ^^ | '
  • Converts external hyperlinks
  • Converts H1-H5 headings
  • Converts bold/italic/underline/strikethrough/superscript/subscript formatting
  • Converts bulleted/numbered lists

CAVEATS

  • No guarantees, no official support, just like the DokuWiki itself (or Word =} )
  • No in-document hyperlink conversion (so make Wiki links yourself after pasting)
  • May not work well with documents that have highly customized styles/templates. It works best with documents written in the default “Normal” template
  • Image/graphics/figures are not supported.
  • Tables go a bit crazy if you have carriage returns inside the cells… I can't get around that, it's a microsoft thing.
  • | or ^ characters within tables get converted to I and /\ or they'll screw it up.

INSTALLATION

 1. Copy the code from below, and save it to a file on your disk named Word2Doku.bas
 2. Start Word
 3. Bring up the Visual Basic Editor (Tools->Macro->Visual Basic Editor or Alt+F11).
 4. From the VBE, import the macro library (File->Import File...) and select the file you downloaded.

USAGE

 1. Open a word document to convert
 2. Run the Word2DokuWiki macro by bringing up the Macros dialog (Tools->Macro->Macros or Alt+F8), selecting Word2DokuWiki and clicking Run.
 3. The macro converts the document to DokuWiki markup and places a copy of the content on the system clipboard.
 4. Switch to a DokuWiki editor page (or new page) and paste the result in.
 

ATTRIBUTION

I did not come up with this macro, I found it in pieces all over the web and made for other wiki formats, and strapped it all together. If you're a MIDIboxer, just hit me up if you need support.

Cheers!

CODE

Attribute VB_Name = "Word2DokuWikiv3"

Sub Word2DokuWiki()
    Application.ScreenUpdating = False
    ReplaceQuotes
    DokuWikiEscapeChars
    DokuWikiConvertHyperlinks
    DokuWikiConvertH1
    DokuWikiConvertH2
    DokuWikiConvertH3
    DokuWikiConvertH4
    DokuWikiConvertH5
    DokuWikiConvertItalic
    DokuWikiConvertBold
    DokuWikiConvertUnderline
    DokuWikiConvertStrikeThrough
    DokuWikiConvertSuperscript
    DokuWikiConvertSubscript
    DokuWikiConvertLists
    DokuWikiConvertTable
    UndoDokuWikiEscapeChars
    ' Copy to clipboard
   ActiveDocument.Content.Copy
   Application.ScreenUpdating = True
End Sub

Private Sub DokuWikiConvertH1()
    ReplaceHeading wdStyleHeading1, "======"
End Sub

Private Sub DokuWikiConvertH2()
    ReplaceHeading wdStyleHeading2, "====="
End Sub

Private Sub DokuWikiConvertH3()
    ReplaceHeading wdStyleHeading3, "===="
End Sub

Private Sub DokuWikiConvertH4()
        ReplaceHeading wdStyleHeading4, "==="
End Sub

Private Sub DokuWikiConvertH5()
    ReplaceHeading wdStyleHeading5, "=="
End Sub

Private Sub DokuWikiConvertH6()
    ReplaceHeading wdStyleHeading5, "="
End Sub

Private Sub DokuWikiConvertBold()
    ActiveDocument.Select
    With Selection.Find
        .ClearFormatting
        .Font.Bold = True
        .Text = ""
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        .Forward = True
        .Wrap = wdFindContinue
       
        Do While .Execute
            With Selection
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If
                                      
                ' Don't bother to markup newline characters (prevents a loop, as well)
                
                If Not .Text = vbCr Then
                    If Not Left(.Text, 2) = "**" Then
                    .InsertBefore "**"
                    End If
                    If Not Right(.Text, 2) = "**" Then
                    .InsertAfter "**"
                    End If
                End If
               
                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Bold = False
            End With
        Loop
    End With
End Sub
 
Private Sub DokuWikiConvertItalic()
    ActiveDocument.Select
   
    With Selection.Find
   
        .ClearFormatting
        .Font.Italic = True
        .Text = ""
       
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
       
        .Forward = True
        .Wrap = wdFindContinue
       
        Do While .Execute
            With Selection
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If
                                      
                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    If Not Left(.Text, 2) = "//" Then
                    .InsertBefore "//"
                    End If
                    If Not Right(.Text, 2) = "//" Then
                    .InsertAfter "//"
                    End If
                End If
               
                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Italic = False
            End With
        Loop
    End With
End Sub
 
Private Sub DokuWikiConvertUnderline()
    ActiveDocument.Select
   
    With Selection.Find
   
        .ClearFormatting
        .Font.Underline = True
        .Text = ""
       
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
       
        .Forward = True
        .Wrap = wdFindContinue
       
        Do While .Execute
            With Selection
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If
                                       
                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    If Not Left(.Text, 2) = "__" Then
                    .InsertBefore "__"
                    End If
                    If Not Right(.Text, 2) = "__" Then
                    .InsertAfter "__"
                    End If
                End If
                
                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Underline = False
            End With
        Loop
    End With
End Sub
 
Private Sub DokuWikiConvertStrikeThrough()
    ActiveDocument.Select
   
    With Selection.Find
   
        .ClearFormatting
        .Font.StrikeThrough = True
        .Text = ""
       
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
       
        .Forward = True
        .Wrap = wdFindContinue
       
        Do While .Execute
            With Selection
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If
                                      
                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    If Not Left(.Text, 2) = "<del>" Then
                    .InsertBefore "<del>"
                    End If
                    If Not Right(.Text, 2) = "</del>" Then
                    .InsertAfter "</del>"
                    End If
                End If
               
                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.StrikeThrough = False
            End With
        Loop
    End With
End Sub
 
Private Sub DokuWikiConvertSuperscript()
    ActiveDocument.Select
   
    With Selection.Find
   
        .ClearFormatting
        .Font.Superscript = True
        .Text = ""
       
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
       
        .Forward = True
        .Wrap = wdFindContinue
       
        Do While .Execute
            With Selection
                .Text = Trim(.Text)
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If
                                       
                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    If Not Left(.Text, 2) = "<sup>" Then
                    .InsertBefore "<sup>"
                    End If
                    If Not Right(.Text, 2) = "</sup>" Then
                    .InsertAfter "</sup>"
                    End If
                End If
                
                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Superscript = False
            End With
        Loop
    End With
End Sub
 
Private Sub DokuWikiConvertSubscript()
    ActiveDocument.Select
   
    With Selection.Find
   
        .ClearFormatting
        .Font.Subscript = True
        .Text = ""
       
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
       
        .Forward = True
        .Wrap = wdFindContinue
       
        Do While .Execute
            With Selection
                .Text = Trim(.Text)
                If Len(.Text) > 1 And InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If
                                       
                ' Don't bother to markup newline characters (prevents a loop, as well)
                If Not .Text = vbCr Then
                    If Not Left(.Text, 2) = "<sub>" Then
                    .InsertBefore "<sub>"
                    End If
                    If Not Right(.Text, 2) = "</sub>" Then
                    .InsertAfter "</sub>"
                    End If
                End If
               
                .Style = ActiveDocument.Styles("Default Paragraph Font")
                .Font.Subscript = False
            End With
        Loop
    End With
End Sub
 
Private Sub DokuWikiConvertLists()
    Dim para As Paragraph
    For Each para In ActiveDocument.ListParagraphs
        With para.Range
            .InsertBefore "  "
             If .ListFormat.ListType = wdListBullet Then
                 .InsertBefore "*"
             Else
                  .InsertBefore "-"
              End If
            For i = 1 To .ListFormat.ListLevelNumber
                   .InsertBefore "  "
           Next i
            .ListFormat.RemoveNumbers
        End With
    Next para
End Sub
 
Private Sub DokuWikiConvertHyperlinks()
    Dim hyperCount As Integer
   
    hyperCount = ActiveDocument.Hyperlinks.Count
   
    For i = 1 To hyperCount
        With ActiveDocument.Hyperlinks(1)
            Dim addr As String
            addr = .Address
            .Delete
            .Range.InsertBefore "["
            .Range.InsertAfter "-" & addr & "]"
        End With
    Next i
End Sub
 
' Replace all smart quotes with their dumb equivalents
Private Sub ReplaceQuotes()
    Dim quotes As Boolean
    quotes = Options.AutoFormatAsYouTypeReplaceQuotes
    Options.AutoFormatAsYouTypeReplaceQuotes = False
    ReplaceString ChrW(8220), """"
    ReplaceString ChrW(8221), """"
    ReplaceString "ë", "'"
    ReplaceString "í", "'"
    Options.AutoFormatAsYouTypeReplaceQuotes = quotes
End Sub
 
Private Sub DokuWikiEscapeChars()
    EscapeCharacter "*"
    EscapeCharacter "#"
    EscapeCharacter "_"
    EscapeCharacter "-"
    EscapeCharacter "+"
    EscapeCharacter "{"
    EscapeCharacter "}"
    EscapeCharacter "["
    EscapeCharacter "]"
    EscapeCharacter "~"
    EscapeCharacter "^^"
    EscapeCharacter "|"
    EscapeCharacter "'"
End Sub
 
Private Function ReplaceHeading(styleHeading As String, headerPrefix As String)
    Dim normalStyle As Style
    Set normalStyle = ActiveDocument.Styles(wdStyleNormal)
   
    ActiveDocument.Select
   
    With Selection.Find
   
        .ClearFormatting
        .Style = ActiveDocument.Styles(styleHeading)
        .Text = ""

      
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
       
        .Forward = True
        .Wrap = wdFindContinue
       
        Do While .Execute
            With Selection
                If InStr(1, .Text, vbCr) Then
                    ' Just process the chunk before any newline characters
                    ' We'll pick-up the rest with the next search
                    .Collapse
                    .MoveEndUntil vbCr
                End If
                                       
                ' Don't bother to markup newline characters (prevents a loop, as well)
               If Not .Text = vbCr Then
                   .InsertBefore headerPrefix
                   .InsertBefore vbCr
                   .InsertAfter headerPrefix
               End If
               .Style = normalStyle
           End With
       Loop
   End With
End Function

Private Sub DokuWikiConvertTable()
Dim TotTables As Long
Do While ActiveDocument.Tables.Count() > 0
ActiveDocument.Tables(1).Range.Select
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = " $s$|$s$ "
.Replacement.Text = "I"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = " $s$^^$s$ "
.Replacement.Text = "/\"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Application.DefaultTableSeparator = "|"
Selection.Rows.ConvertToText Separator:=wdSeparateByDefaultListSeparator, NestedTables:=True
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^p"
.Replacement.Text = "|^p|"
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.InsertBefore ("|")
Selection.InsertParagraphAfter
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^p|^p"
.Replacement.Text = "^p"
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "$s$blank$s$"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "||"
.Replacement.Text = "|  |"
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "||"
.Replacement.Text = "|  |"
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "| |"
.Replacement.Text = "|  |"
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
With Selection.Find
.Text = "| |"
.Replacement.Text = "|  |"
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Paragraphs(1).Range.Select
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "|"
.Replacement.Text = "^^"
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Loop
End Sub
Private Sub UndoDokuWikiEscapeChars()

    UndoEscapeCharacter "*"
    UndoEscapeCharacter "#"
    UndoEscapeCharacter "_"
    UndoEscapeCharacter "-"
    UndoEscapeCharacter "+"
    UndoEscapeCharacter "{"
    UndoEscapeCharacter "}"
    UndoEscapeCharacter "["
    UndoEscapeCharacter "]"
    UndoEscapeCharacter "~"
    UndoEscapeCharacter "^^"
    UndoEscapeCharacter "|"
    UndoEscapeCharacter "'"

End Sub

Private Function EscapeCharacter(char As String)
    ReplaceString char, " $s$" & char & "$s$ "
End Function

Private Function UndoEscapeCharacter(char As String)
    ReplaceString " $s$" & char & "$s$ ", char
End Function

Private Function ReplaceString(findStr As String, replacementStr As String)
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = findStr
        .Replacement.Text = replacementStr
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Function

home/software/windows/word2doku.txt · Last modified: 2008/12/20 02:52 by stryd_one