Moin wiki dump
Five to ten years ago I wrote a lot of content on my personal wiki: notes, thoughts, business ideas, time tracking. Many different things, most of them irrelevant now. But some of it dates back to when my father died (in 2005), and those notes and thoughts mean a lot to me.
So I decided to copy it into my current notes directory as plain text files.
At the time I was running my own MoinMoin wiki installation. This wiki system stores each revision of a page in a separate file named after an incrementing, zero-padded revision number, like this:
➜ ls -la "/Users/jtj/Dropbox/old data/chopwiki/data/pages/VmwareTodo/revisions/"
total 24
drwxr-x---@ 5 jtj staff 170B Oct 24 2012 ./
drwxr-x---@ 6 jtj staff 204B Oct 24 2012 ../
-rw-r-----@ 1 jtj staff 98B Oct 24 2012 00000001
-rw-r-----@ 1 jtj staff 95B Oct 24 2012 00000002
-rw-r-----@ 1 jtj staff 95B Oct 24 2012 00000003
To extract the relevant content, I simply needed to copy the most recent revision of each page (the file with the highest number) and give it a suitable filename.
After a bit of tinkering, I ended up with the following Ruby code which I could paste into a Rails console:
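The snippet itself is not reproduced here. What follows is only a minimal sketch of the same idea, not the original code: for each page directory, pick the highest-numbered file under `revisions/` and copy it to a `.md` file. The names `snake_case` and `dump_latest_revisions` are hypothetical, and `snake_case` is a plain-Ruby approximation of the `underscore` helper that a Rails console would provide, so the sketch also runs outside Rails.

```ruby
require "fileutils"

# Hypothetical helper: rough plain-Ruby stand-in for ActiveSupport's
# String#underscore, e.g. "VmwareTodo" becomes "vmware_todo".
def snake_case(name)
  name.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')
      .downcase
end

# Hypothetical helper: copies the newest revision of every page found in
# pages_dir into out_dir, one "<snake_case page name>.md" file per page.
def dump_latest_revisions(pages_dir, out_dir)
  FileUtils.mkdir_p(out_dir)

  Dir.children(pages_dir).sort.each do |page|
    rev_dir = File.join(pages_dir, page, "revisions")
    next unless File.directory?(rev_dir) # skip pages without any revisions

    revisions = Dir.children(rev_dir).select { |f| f.match?(/\A\d+\z/) }
    next if revisions.empty?

    # Revision files are zero-padded integers, so the numeric maximum is the newest.
    latest = revisions.max_by(&:to_i)
    FileUtils.cp(File.join(rev_dir, latest),
                 File.join(out_dir, "#{snake_case(page)}.md"))
  end
end

# Example call (both paths are placeholders):
# dump_latest_revisions("/path/to/wiki/data/pages", "/path/to/notes/chop-wiki")
```

Taking the highest revision number is the simplest reading of "most recent version"; it makes no attempt to detect pages that were deleted in the wiki.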
What I ended up with is one raw text file per original wiki page, like this:
➜ /Users/jtj/notes/chop-wiki ls -la | head -n 20
total 2216
-rw-r----- 109 Nov 6 01:05 (c385)rhus_musik_web.md
-rw-r----- 1281 Nov 6 01:05 (c398)jenl(c3a6)ger.md
-rw-r----- 1041 Nov 6 01:05 (c398)nske_seddel_f(c3b8)dselsdag2006.md
-rw-r----- 285 Nov 6 01:05 (c398)nsker_jul2006.md
-rw-r----- 24 Nov 6 01:05 aarhus_rb.md
-rw-r----- 1255 Nov 6 01:05 acure_noter(2f)epm_noter.md
-rw-r----- 5333 Nov 6 01:05 acure_noter(2f)pem_noter.md
-rw-r----- 958 Nov 6 01:05 acure_noter(2f)ssh_tunnel.md
-rw-r----- 3933 Nov 6 01:05 acure_noter(2f)websee_strategi.md
-rw-r----- 2031 Nov 6 01:05 acure_noter.md
-rw-r----- 1578 Nov 6 01:05 apple_setup.md
-rw-r----- 230 Nov 6 01:05 apple_subscription.md
-rw-r----- 520 Nov 6 01:05 avi2_dvd.md
-rw-r----- 235 Nov 6 01:05 b(c3b8)ger.md
-rw-r----- 49585 Nov 6 01:05 bad_content.md
-rw-r----- 936 Nov 6 01:05 bil_problem02042008.md
-rw-r----- 2839 Nov 6 01:05 blogging_ideas.md
The special characters in the filenames are still hex-escaped (for example, (c3b8) is the UTF-8 encoding of ø and (2f) is a slash), but that can be fixed in a later step.
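As a sketch of what that later step might look like: judging by the listing above, each parenthesized group is a run of hex-encoded UTF-8 bytes, so decoding is a matter of turning every `(hex)` group back into bytes. The function name `decode_moin_name` and the choice to replace a decoded slash with `-` (so that the result stays a valid filename) are assumptions made for this example.

```ruby
# Hypothetical helper: turns "(c3b8)"-style groups back into the characters
# they encode. A decoded "/" is replaced by `slash` to keep the filename valid.
def decode_moin_name(name, slash: "-")
  decoded = name.gsub(/\(((?:[0-9a-fA-F]{2})+)\)/) do
    # Pack the hex digits into raw bytes, then mark those bytes as UTF-8.
    [Regexp.last_match(1)].pack("H*").force_encoding(Encoding::UTF_8)
  end
  decoded.tr("/", slash)
end
```

With this, `decode_moin_name("b(c3b8)ger.md")` gives `"bøger.md"`, and `decode_moin_name("acure_noter(2f)ssh_tunnel.md")` gives `"acure_noter-ssh_tunnel.md"`. Renaming the files would then be a loop over the notes directory calling `File.rename` with the decoded name.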