Reposting from the AML TempSite

Posted by 2718.us on Mon, 07 Sep 2009 01:43:41 +0000 at http://2718.us/blog/2009/09/06/reposting-from-the-aml-tempsite/

This is unlikely to interest many people, but anyone who used uJournal (uJ) or AboutMyLife (AML), which absorbed uJ after its demise, should know that there is a temporary site up at http://aboutmylife.net/tempsite/ where you can get a very bare dump of your entire journal.  If you want to take all those entries and post them into your current journal, here is a process for doing that.

THIS INFORMATION IS PROVIDED AS-IS WITH NO EXPRESS OR IMPLIED WARRANTY. USE AT YOUR OWN RISK. It worked for me, but who knows what that may mean for you.

Requires: Python 2.something (maybe 2.4 or later?). Mac OS X 10.4 works fine, as will most current Linux/Unix systems, I think.

  1. Go to the AML tempsite, log in, and save the file that shows up (which is all your entries, but totally lacking formatting, etc.) as “entries.html”.
  2. Download pyLJxmlrpc.py from Google Code (I just put it there; I wrote it) and save it in the same directory as entries.html.
  3. Copy/paste the following into a file (I called it “processEntries.py” but it doesn’t really matter), and change USERNAME and PASSWORD to the username and password of the account to which you want to post. You can also change “www.livejournal.com” to another journal site; it should work on any LJ-based site that supports the XML-RPC protocol. Line wrapping and whitespace are important.
    
    #!/usr/bin/python
    
    import re
    
    import pyLJxmlrpc
    
    # Read the whole AML dump into memory.
    f = open('entries.html')
    s = f.read()
    f.close()
    
    # Each entry starts at this row boundary in the dump's table markup.
    a = s.split('</td></tr><tr></tr><tr><td width="25%">')
    # Groups: year, month, day, hour, minute, subject, body.
    r = re.compile(r'([0-9]{4})-([0-9]{2})-([0-9]{2}) ([0-9]{2}):([0-9]{2}):[0-9]{2}</td><td width="75%">(.*)</td></tr><tr><td> </td><td>(.*)',re.DOTALL)
    
    processedEntries = {}
    for e in a:
        m = r.search(e)
        if m is None:
            continue  # skip any chunk that does not look like an entry
        t = "%s-%s-%s %s:%s" % (m.group(1), m.group(2), m.group(3), m.group(4), m.group(5))
        processedEntries[t] = {'year':m.group(1), 'mon':m.group(2), 'day':m.group(3), 'hour':m.group(4), 'min':m.group(5), 'subject':m.group(6), 'body':m.group(7)}
    
    # Post in chronological order; the "YYYY-MM-DD HH:MM" keys sort correctly as strings.
    sk = processedEntries.keys()
    sk.sort()
    
    lj = pyLJxmlrpc.pyLJxmlrpc()
    
    for k in sk:
        e = processedEntries[k]
        lj.call_withParams_atURL_forUser_withPassword_(
            'postevent',
            {'event':e['body'], 'lineendings':'unix', 'subject':e['subject'],
             'security':'private',
             'year':e['year'], 'mon':e['mon'], 'day':e['day'],
             'hour':e['hour'], 'min':e['min'],
             'props':{'opt_backdated':True, 'taglist':'aml-raw'}},
            'http://www.livejournal.com/interface/xmlrpc/', 'USERNAME', 'PASSWORD')
        # Printed after the post attempt, so an error shows up before this line.
        print "%s: %s" % (k, e['subject'])
    
  4. At a command prompt (Mac: run Terminal), change to the directory in which you saved the two .py files and entries.html, and run
    python processEntries.py

    and watch it go. It’ll only take a few seconds to pull apart the HTML file, but reposting entries takes time. The script prints the date/subject of each entry *after* attempting to post it, so any error you see pertains to the entry whose date/subject is printed immediately after the error.
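
Because the date/subject is only printed after the post attempt, matching an error to its entry takes a little care. One way around that, as a rough sketch only, is to catch the exception for each entry and report it next to the entry's key. Everything named here is an assumption for illustration: `post_all`, `post_entry` and `fake_post` are hypothetical and not part of pyLJxmlrpc, and `fake_post` merely stands in for the real network call. Unlike the script above, this needs Python 2.6 or later (it also runs on Python 3).

```python
def post_all(entries, post_entry):
    """Post entries in key order; return (key, error message) pairs for failures.

    `post_entry` is a hypothetical stand-in for the real posting call.
    """
    failures = []
    for key in sorted(entries):
        entry = entries[key]
        try:
            post_entry(entry)
        except Exception as err:
            # Report the failure together with the entry that caused it.
            failures.append((key, str(err)))
            print("FAILED %s: %s (%s)" % (key, entry['subject'], err))
        else:
            print("%s: %s" % (key, entry['subject']))
    return failures


def fake_post(entry):
    # Stand-in for the network call: reject blank bodies, as a server might.
    if not entry['body'].strip():
        raise ValueError("empty body")


# Made-up entries in the same shape as processedEntries.
sample = {
    '2004-01-02 03:04': {'subject': 'first', 'body': 'hello'},
    '2004-01-03 10:00': {'subject': 'quiz', 'body': ''},
}
failures = post_all(sample, fake_post)
```

In the real script, `post_entry` would wrap the `lj.call_withParams_atURL_forUser_withPassword_` call; what exception (if any) that call raises on failure is not something I have checked, so treat the broad `except Exception` as a placeholder.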

Every entry from AML that didn’t have an empty body will be posted with its date-time maintained, set to private, and backdated; you will see error messages for any entries that were blank (since the AML tempsite thing strips out all HTML, this left me with some blank entries where meme/quiz results had been).
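
If you would rather know which entries are blank before posting anything, a dry run over the dump can list them. The following is a minimal sketch under some assumptions: it reuses the separator and pattern from the script above, `split_blank` is a hypothetical helper name, treating a whitespace-only body as "blank" is my guess at what counts, and the sample markup is made up to fit the pattern rather than copied from a real AML dump.

```python
import re

SEPARATOR = '</td></tr><tr></tr><tr><td width="25%">'
ENTRY_RE = re.compile(
    r'([0-9]{4})-([0-9]{2})-([0-9]{2}) ([0-9]{2}):([0-9]{2}):[0-9]{2}'
    r'</td><td width="75%">(.*)</td></tr><tr><td> </td><td>(.*)',
    re.DOTALL)


def split_blank(html):
    """Return (postable, blank) lists of 'YYYY-MM-DD HH:MM' keys."""
    postable, blank = [], []
    for chunk in html.split(SEPARATOR):
        m = ENTRY_RE.search(chunk)
        if m is None:
            continue  # not an entry
        key = "%s-%s-%s %s:%s" % m.group(1, 2, 3, 4, 5)
        # Group 7 is the body; a whitespace-only body counts as blank here.
        if m.group(7).strip():
            postable.append(key)
        else:
            blank.append(key)
    return sorted(postable), sorted(blank)


# Made-up sample in the shape the pattern expects (not real AML output).
sample = (
    '<table><tr><td width="25%">'
    '2004-01-02 03:04:05</td><td width="75%">First</td></tr><tr><td> </td><td>Hello'
    + SEPARATOR +
    '2004-01-03 10:00:00</td><td width="75%">Quiz</td></tr><tr><td> </td><td>'
)
postable, blank = split_blank(sample)
```

With a real dump you would pass in the contents of entries.html instead of `sample`; any trailing markup at the end of the file lands in the last entry's body, so the last entry may never register as blank.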
