
Store wiki offline?


mess


Just look around for some programs; there are a bunch of them that will do that for you. But please be careful and read the instructions twice; programs like these can create an incredible amount of traffic if misused... and that's bad for the page owner and (maybe) also for you.

Another option is the Pro version of Acrobat, which also has a site-fetching function.

Set the values to remain on the domain and restrict the depth level!

Or do it the old way: just save the HTML pages that are relevant to you; I think most of the wiki is still quite manageable ;)

Cheers,

Michael
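
As a rough sketch of the restricted crawl described above, the following minimal Python spider stays on one domain, limits the crawl depth and pauses between requests so the server is not hammered; the start URL, depth and delay are illustrative assumptions, not values anyone in this thread used:

[code]
# Sketch only: stay on one domain, limit depth, throttle requests.
# START, MAX_DEPTH and DELAY are illustrative assumptions.
import time
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

START = "http://www.midibox.org/dokuwiki/"   # assumed entry point
MAX_DEPTH = 2                                # "restrict the depth level!"
DELAY = 1.0                                  # seconds between requests

class LinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

domain = urlparse(START).netloc
seen, queue = set(), [(START, 0)]

while queue:
    url, depth = queue.pop(0)
    if url in seen or depth > MAX_DEPTH:
        continue
    seen.add(url)
    try:
        html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
    except OSError:
        continue
    # ... write `html` to disk here ...
    parser = LinkParser()
    parser.feed(html)
    for link in parser.links:
        absolute = urljoin(url, link)
        if urlparse(absolute).netloc == domain:   # "remain on the domain"
            queue.append((absolute, depth + 1))
    time.sleep(DELAY)                             # be kind to the server
[/code]

Real downloaders add link rewriting and filetype filters on top of this, but domain, depth and delay are the three settings worth double-checking before letting any of them loose.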


I tried some of those programs in the past, with varying results...

Right now I use my browser's save function, but it's not the nicest solution.

I was hoping that the wiki itself had a feature to export the source or something like that...

My primary goal is to archive my own wiki page, because it's my only documentation at the moment

(apart from the source code).


Sorry mess, but I've looked into this and there's really no easy way... Most 'spider'-type apps like the ones you referred to (as you might already know) won't be able to collect the pages either :(

I was trying to make a midibox.org CD/DVD for those without a fast internet connection, but I've basically abandoned the idea, as it can't be automated....


Hm,  :-\

What's the issue with auto-downloaders (if restricted to 1 or 2 levels, HTML docs only, staying on the domain)?

http://sourceforge.net/search/index.php?words=offline+grabber&sort=num_downloads&sortdir=desc&offset=0&type_of_search=soft&form_cat=18

Features of GetLeft:

- As it goes, it changes the original pages: all the links get changed to relative links, so that you can browse the site from your hard disk without those pesky absolute links.

- Limited FTP support: it will download the files, but not recursively.

- Resumes downloading if interrupted.

- Filters to avoid downloading certain kinds of files.

- You can get a site map before downloading.

- GetLeft can follow links to external sites.

- Multilingual support; at present GetLeft supports Dutch, English, Esperanto, German, French, Italian, Polish, Korean, Portuguese, Russian, Turkish and Spanish.

- Some others not worth mentioning.

@stryd: if you want to provide an offline version, I could help (but not now; maybe PM me in one or two weeks?)

It would be traffic-friendly if we did such a thing just once and seeded a ZIP or CD-ROM, instead of everyone grabbing everything all the time...


I already have a script which allows me to generate an offline version of ucapps.de and midibox.org from the original version located on my laptop's HD (the script automatically fixes the links).

But it doesn't work on the forum and the wiki.

Best Regards, Thorsten.


Just FYI...

The problem is with dynamically generated PHP pages, like this:

*/index.php?topic=7396.0;topicseen

This means that all of the forum pages are called index.php (or index.htm from the client side). A spider application that saves your files will name them differently every time, as it has to make up a random name to save each file as... This means it would have to change all the links in all the posts to point to these names, and in order to be updateable without fully regenerating the site mirror, it would have to maintain a database of the changed links. None of the freely available spider apps support this operation, so every time the CD was made, we would hammer poor twin-x's servers by re-downloading the entire content.
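
A hedged sketch of the bookkeeping such a spider would need: derive a stable local filename from each dynamic URL instead of a random one, and persist the URL-to-filename mapping so links can be rewritten consistently and a later run can update files in place. All names below are illustrative assumptions:

[code]
# Sketch only: deterministic filenames plus a persisted link database,
# the two things the free spiders are said to lack above.
import hashlib
import json
from pathlib import Path

MAP_FILE = Path("url_map.json")   # assumed name for the link "database"

def local_name(url: str) -> str:
    # The same URL always yields the same file, so a re-run updates
    # in place instead of inventing a new random name each time.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:12]
    return "page-" + digest + ".html"

url_map = json.loads(MAP_FILE.read_text()) if MAP_FILE.exists() else {}

url = "http://example.org/forum/index.php?topic=7396.0"   # illustrative URL
url_map[url] = local_name(url)
# ... download the page, save it as url_map[url], rewrite links via url_map ...

MAP_FILE.write_text(json.dumps(url_map, indent=2))
[/code]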

The wiki can be exported from the web server by a DokuWiki add-on, but it will only run on a newer PHP version than twin-x's server is running, so he would have to upgrade and risk breaking all his hosted sites.

Now of course we could get the source content and run a mirror using a PHP server on a DVD (so you would insert the DVD and it would run a web server on your local PC that would work just like the one you're reading now), but of course that would make the inner workings of the forum, including the logins etc., available to everyone who had the DVD. I like you guys, but I'm not telling you my password! ;)

I hate to be a stick in the mud, but there's really no practical way of getting the forum or the wiki working offline at present without causing twin-x, "our hero", a great deal of hassle/money/downtime.

Sorry guys, I tried, really I did!


So, folks, time to get serious:

No one tried the GetLeft link, right?  :P tsss... ;)

Neither did I... but now I have, because I'm a PHP programmer too, and I cannot imagine that all application developers are so dumb as to ignore automatically generated PHP pages.

After installing GetLeft (which took me about 5.2 seconds), I am now getting a site copy of the "Application Development" section of the wiki: 2 levels deep, 1 level of external documents, ignoring PDFs, AVIs, MOVs and ZIPs  8)

This fine little app even has an "update" option, so you don't have to grab everything again once the site has been updated...

I could rip it in depth next week and then seed a file; how about that?

Should PDFs be part of the offline version? ZIPs too?

Cheers,

Michael


I did try that one but had no luck; it kept naming the files strangely, and I found a forum thread from others with the same problem... I guess it was "pilot error"  :-[ Sorry!!!

Yeah, PDFs and ZIPs would be good too :)

I will seed the thing too; I'm on a 1-megabit upload, so that should keep a few people happy ;)

Don't forget to throttle back the speed of your spider so it doesn't hammer the server. I suspect you've already done this, but just to be safe :)


How about moving the wiki to MediaWiki, so it supports more formatting tags? There's also a script that comes with it which allows one to easily dump the whole thing to static HTML.

Also, MediaWiki seems to run considerably faster than DokuWiki.

-Steve

It is far more difficult to use/create the user database for logging in with MediaWiki in combination with SMF.


Quote: "It is far more difficult to use/create the user database for logging in with MediaWiki in combination with SMF."

Ah, I can understand that, then. What about allowing it to be a more traditional wiki where anyone can edit without an account? Vandalism doesn't happen very often, and if it did, self-policing should take care of it. With the essentially eternal history, rolling back changes is trivial.


Quote: "I did try that one but had no luck; it kept naming the files strangely, and I found a forum thread from others with the same problem... I guess it was 'pilot error'"

Hmm... this is unlikely to be pilot error: it might be that the program behaves differently on Mac and Windows.

A typical filename is, for example, "doku.phpid=c_tips_and_tricks_for_pic_programming.html"... that should be okay for other platforms, shouldn't it?

cheers,

Michael


Thanks, Twin-x :)

In the meantime I have already grabbed the page with GetLeft, and while checking the results I found it had left some errors in there: while most of the links were formatted correctly, there were some second-level links where GetLeft had forgotten to append the ".html" ending :(

If I knew how to use grep with a NOT statement, this could be corrected in 10 seconds...

Does anyone know how I could achieve this?

Find: [tt]href="doku.phpid=[something]"[/tt]

Replace with: [tt]href="doku.phpid=[something].html"[/tt]

Cheers,

Michael
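
One way to express that NOT is a regular-expression negative lookbehind, which only matches links that do not already end in ".html". A minimal Python sketch, where the "mirror" folder name is an assumption:

[code]
# Sketch only: append ".html" to doku.phpid= links that lack it.
# The negative lookbehind (?<!\.html) is the "NOT" part of the search.
import re
from pathlib import Path

pattern = re.compile(r'(href="doku\.phpid=[^"]*)(?<!\.html)(")')

for page in Path("mirror").rglob("*.html"):   # "mirror" is an assumed folder
    text = page.read_text(encoding="utf-8", errors="ignore")
    fixed = pattern.sub(r'\1.html\2', text)
    if fixed != text:
        page.write_text(fixed, encoding="utf-8")
[/code]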


4 months later...

Twin-x,

I'm on a beta team for something I run at the studio and don't have net access there. I really wanted to take the current postings (from a forum) with me each night, but no "offline" browsers would save the content of the forum, due to some PHP crap or something (I'm a bit "web illiterate" ;)).

Do you think that thing would do it? I run Opera here, but it would be worth running FF just to do that. I've been having to save all of the relevant posts from there as separate HTML files and drag them with me each night.

Thanks for the tip!

George

BTW--- I run a thing called HTTrack (http://www.httrack.com) for saving most sites and it usually does well, even at the defaults. I keep a bunch of sites I need on CD.


Quote: "BTW--- I run a thing called HTTrack (http://www.httrack.com) for saving most sites and it usually does well, even at the defaults."

HTTrack does not work well on the wiki. At least not for me.


Yeah, I'm wondering more about that other forum, though. I tried a few offline browsers in addition to HTTrack, back before I gave up, and they all just saved the main entry page. Saving the current posts has gotten to be a pain, as they all have to be named and such, plus I don't know which ones I'll need to have with me each night.

George


FWIW - that didn't appear to work for the beta forum. It saved a bunch of links and stuff, but it would hang with a password-entry box when I tried to open it with no net access. It also gave me a bunch of messages about not having saved parts of the site when it was done, so I don't think the content I was interested in even came in.

No big deal anyway. :-\

George

