aaron swartz: the early works
secretGeek .:dot Nuts about dot Net:.

I can't stop thinking about, wondering about, caring about, reading about the tragic life of Aaron Swartz. There's a lot I want to write. I think I could fill a book just trying to process what it means, what is an appropriate response, what's it all about. But I'm not going to attempt that.

I've been reading Aaron's blog, on and off, for over ten years. Ten years is a long time. And by my own estimates, those particular 10 years were the longest in history.

Long ago I printed out his HOWTO: Be more productive for multiple re-reads and have returned to it many times since.

I wanted to go back, right back, and try to work out the earliest stuff of his that I read. And I wanted to watch the progression of his ideas as they emerged.

On his blog, 'Raw Thought', there's a link to 'Older Posts' which takes you to 'the archive' (grouped by theme).

From there, a link to 'Full Archives' takes you to the reverse-chronological archives.

These stretch back to May 2005 (the oldest entry on that page is about a server crash after which he had to restart his blogging). Under the so-called 'Full Archives' section there's no link to anything prior to May 2005.

Now I'm certain he was blogging long before that -- I'm certain I was reading his blog long before that.

Is the stuff before that server crash lost? I hoped not, so I set about locating it.

I clearly remember his PowerPoint remix (from 2003!), which got published in a book of Joel Spolsky's, and I soon tracked that down.

A look at the URL suggests a numbered blogging system (from Dave Winer's Radio UserLand), and from there it's easy to find all of his prior blog entries.

After a bit of binary searching I found what looks like Aaron's first Hello, world, with article id of '81'.
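
For the curious, the binary search can be sketched roughly like this. It's a minimal sketch, not what I actually ran (I did it by hand in the browser): `Find-FirstTrue` is a made-up helper name, and it assumes the test is false for every id below the first article and true from there on. The gaps in the real numbering break that assumption, so treat the answer as a starting point and check its neighbours by hand.

```powershell
# Hypothetical helper: smallest id in [$Low, $High] for which $Test returns $true.
# Assumes $Test is false below some id and true from that id upwards,
# and that $Test is true for $High.
function Find-FirstTrue {
    param([int]$Low, [int]$High, [scriptblock]$Test)

    while ($Low -lt $High) {
        # integer midpoint, rounded down
        $mid = [int][math]::Floor(($Low + $High) / 2)
        if (& $Test $mid) {
            $High = $mid        # first true id is at $mid or below
        } else {
            $Low = $mid + 1     # first true id is above $mid
        }
    }
    return $Low
}

# Stand-in test for illustration; a real one would ask the server whether the id exists.
Find-FirstTrue -Low 1 -High 2000 -Test { param($n) $n -ge 81 }
```

Swapping the stand-in for a real "does this article exist?" check is the only part that touches the network, which is why it's kept separate here.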

So I wrote a PowerShell script to download everything (I hardly think aaronsw would object!!) and found that the articles go from number 81 up to 1691, with a few gaps.

Here's the script.

# Downloads aaron's early stuff
# i've done this the hard way because i didn't have time to do it the easy way.

$client = New-Object System.Net.WebClient

#detected up to 1691  (April 26, 2005)
$nums = 81..1691

$nums | % {
    $url = [string]::Format("http://www.aaronsw.com/weblog/{0:000000}", $_)
    $path = Join-Path $(Get-Location) ([string]::Format("aaronsw_{0:000000}.html", $_))
    Write-Host "downloading" $url "to" $path
    try {
        $client.DownloadFile($url, $path)
    } catch {
        #some numbers are gaps: note the failure and keep going
        Write-Host "skipped" $url ":" $_.Exception.Message
    }

    #sleep for 4 seconds after each grab, to give the server time to exhale.
    Start-Sleep -s 4
}
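
To see where the gaps are once the download has finished, something like the following would do. It's a rough sketch: `Get-MissingIds` is a hypothetical helper, and it assumes the files are named `aaronsw_NNNNNN.html` exactly as the script above writes them.

```powershell
# Hypothetical helper: list the article ids in [$First, $Last] that have no downloaded file.
function Get-MissingIds {
    param([string[]]$FileNames, [int]$First = 81, [int]$Last = 1691)

    # remember every id we can see in a file name
    $have = @{}
    foreach ($name in $FileNames) {
        if ($name -match '^aaronsw_(\d+)\.html$') {
            $have[[int]$Matches[1]] = $true
        }
    }

    # anything in the range that we never saw is a gap
    $First..$Last | Where-Object { -not $have.ContainsKey($_) }
}

# usage: pass in the names of the files in the current folder
Get-MissingIds -FileNames @(dir .\aaronsw_*.html | % { $_.Name })
```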

Then I wrote a script to walk through those files and create an archive page in the same style as Aaron's other archive pages.

It's not pretty code, but it got the job done...

dir .\aaronsw_*.html | % {

    #extract the file number out of the name... i should've made this easier.
    $num = $_.Name.Split("_")[1].Split(".")[0]

    #calculate the target url for this file (the number keeps its leading zeros)
    $url = [string]::Format("http://www.aaronsw.com/weblog/{0}",$num)

    #load the file, joining the lines into one string so the regexes see the whole page
    $article = (gc $_.FullName) -join " "

    #grab the title (non-greedy, so it stops at the first </h1>)
    $titleRegex = [regex]'<h1>(.*?)</h1>'
    $title = $titleRegex.Match($article).Groups[1].Value

    #grab the time
    $timeRegex = [regex]'<p class="posted">posted ([^(]+) \('
    $time = $timeRegex.Match($article).Groups[1].Value

    #output the url, title and time, as html
    $item = [string]::Format('<p><a href="{0}">{1}</a> ({2})</p>',$url,$title,$time)
    $item >> archivePreCrash.html
}
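
If you want to try the two regexes without downloading anything, here's a small sketch that runs them over a string. `ConvertTo-ArchiveItem` is a hypothetical helper, and the sample HTML (title and date included) is made up to match the shape the regexes expect. It is not copied from a real article.

```powershell
# Hypothetical helper: turn one article's HTML into one line of the archive page.
function ConvertTo-ArchiveItem {
    param([string]$Html, [string]$Url)

    # same patterns as the script above
    $title = [regex]::Match($Html, '<h1>(.*?)</h1>').Groups[1].Value
    $time  = [regex]::Match($Html, '<p class="posted">posted ([^(]+) \(').Groups[1].Value

    [string]::Format('<p><a href="{0}">{1}</a> ({2})</p>', $Url, $title, $time)
}

# made-up sample page, for illustration only
$sample = '<h1>Example title</h1> <p class="posted">posted January 1, 2000 12:00 PM (Example) </p>'
ConvertTo-ArchiveItem -Html $sample -Url 'http://www.aaronsw.com/weblog/000081'
```

If either pattern fails to match, the corresponding field just comes out empty, which is an easy way to spot pages with different markup.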

So the result is this fairly complete list of pre-server-crash articles:

 

aaronsw archive: early works

 

Now this takes us up to April 2005. And the post-crash articles start in May 2005, so it probably means that everything's accounted for, except maybe a month's worth of blogging. There are some missing articles within that period, and some lost stuff. I can see that he restored it from the wayback machine where possible, but sometimes there was nothing to grab.

There are a lot of gems in there (and of course a bit of drivel: this starts when he was 15). I was going to pull out a few quotes, but I'd rather let you do that for yourself. He was a thoughtful guy. It'd be great if he was still around.





'Chip Camden' on Sat, 19 Jan 2013 20:13:41 GMT, sez:

Aaron would be proud.



'lb' on Sun, 20 Jan 2013 12:46:39 GMT, sez:

thanks Chip ;-)



'OJ' on Mon, 21 Jan 2013 09:02:02 GMT, sez:

Great post LB.





© Leon Bambrick 2012