Can’t Wait to use Pivot to Explore My Data

I’ve thought about this a lot, but I’ve never seen a dataviz program like this before. Hans Rosling’s program was very informative, but it was based on relatively simple government data. Pivot is made to draw from much larger data sets; it’s designed for making sense of the modern internet.

I like how Pivot will let me view information the way I want to view it – start off by viewing everything all at once, then drill down into more detail after taking a look at the trends and common qualities of large bins of data.

There are so many things I want to use this for… I hope I can be diligent in my studies and eventually be a good enough programmer to develop something that takes advantage of Pivot’s capabilities. Even if I never get there, somebody out there is definitely going to make an amazing tool out of this, and I’ll just use theirs once they release it.

Also posted in blog entries, video | 1 Comment

February 2010 Status Update

I recently started doing some consulting for home clients in La Jolla and Pacific Beach. They’re all private residences, where I help fix general networking and computer problems. One client had a huge list of problems, but I was somehow able to fix all of them in a few hours! That allowed us to move on to planning her blog and website layouts; we got a lot done, but there’s a lot more work left for later.

Jobs like these are kind of fun for me – I love being able to solve problems for people, especially nice ones who don’t know where to go for help (usually they end up having to visit Best Buy, etc.).

Fixing small problems that just require a little bit of experience feels like such a mutually beneficial exchange. For me, the gratitude I’ve received feels great, and then I even get paid! My clients (several moms) are getting a great deal because the value of having annoying problems solved is so much greater than the cost of getting rid of them.

Also posted in blog entries

Server Issues

Sorry for not posting much lately. I ran into a hosting problem this month: some of the permissions on my server were set incorrectly during an account transfer (the migration script’s fault, not mine).

I don’t have root access on my server, so I wasn’t able to log in or make posts until I found all the offending files and contacted support about them.

The good news is that I’ve had a lot of other things to work on that were keeping me away anyway, including installing some virtual appliances, searching for a job, and getting a lot of computing hardware set up and networked.

Also posted in blog entries

My Amazon EC2 server specs

Lately I’ve been working on a multi-user database on Amazon EC2 using Ubuntu Server 9.10 (Karmic Koala) and PostgreSQL 8.3.

I wrote in the last entry about how cool it is that you can connect hard drives so easily, and I want to say something more about it. While I was reading the fine manual, I learned about partitioning the database for increased performance.

When I say that my database is fully partitioned now, I mean that I’ve spread my data out over multiple drives so that multiple read and write operations can go on at the same time, where a single drive would waste a lot of time seeking. Thanks to Amazon’s Elastic Block Store, I can create disk volumes that are only as large as I need them to be. In this case they’re 1-5GB each (when’s the last time you saw a 2GB HDD for sale?). If I were building a computer on my own, the smallest drive I could get would be 40GB, so attaching 5 of those would cost quite a bit even using small drives. But on Amazon, I only pay $0.15 per GB for every month that I have the volumes reserved. Checking my current AWS statement, the 15 days in November have cost me only ~$1.50 for the hard drive space and ~$20 for the server time. Sweet!
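
To make the storage math concrete, here is a rough Python sketch that prorates the cost of the volumes listed further down in this post. It assumes a flat $0.15 per GB-month (the rate quoted above) and nothing else; I/O request charges and volumes that only existed for part of the period are not modeled, and the function and variable names are made up for illustration.

```python
# Assumption: flat storage rate as quoted in the post; I/O request charges are NOT modeled.
RATE_PER_GB_MONTH = 0.15

# Volume sizes in GB, taken from the drive setup listed in the post.
VOLUME_SIZES_GB = [5, 1, 2, 3, 2]


def storage_cost(sizes_gb, rate_per_gb_month, fraction_of_month):
    """Return the prorated storage cost in dollars for the given volumes."""
    total_gb = sum(sizes_gb)
    return total_gb * rate_per_gb_month * fraction_of_month


if __name__ == "__main__":
    half_month = storage_cost(VOLUME_SIZES_GB, RATE_PER_GB_MONTH, 15 / 30)
    full_month = storage_cost(VOLUME_SIZES_GB, RATE_PER_GB_MONTH, 1.0)
    print(f"Total provisioned: {sum(VOLUME_SIZES_GB)} GB")
    print(f"15 days: ${half_month:.2f}, full month: ${full_month:.2f}")
```

With 13GB provisioned, this comes out to a bit under a dollar for 15 days and just under two dollars for a full month. That is lower than the ~$1.50 on the statement, so the statement presumably includes something this sketch leaves out; which charges exactly is a guess.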

Now, it may sound really great when I say it like this, but I’m still not sure how well the drives perform vs. a non-cloud drive. So the next step is to figure out what the standard benchmarking procedure is and do some testing.

Here are the specs for my drive setup:
5GB mounted from /dev/sdf – used to store raw data
1GB mounted from /dev/sdg – used to store write-ahead logs for PostgreSQL
2GB mounted from /dev/sdh – used to store pre-generated queries of type A
3GB mounted from /dev/sdi – used to store pre-generated queries of type B
2GB mounted from /dev/sdj – used to store pre-generated queries of type C
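
One way PostgreSQL can place data on separate volumes like these is with tablespaces (`CREATE TABLESPACE ... LOCATION ...`). Whether that is exactly how this setup was built isn't stated here, so treat the following Python sketch as an illustration only: the tablespace names and mount points are hypothetical, and the write-ahead log volume is left out because WAL is not placed through tablespaces.

```python
# Hypothetical tablespace names and mount points; the post lists only devices and sizes.
DATA_VOLUMES = [
    ("raw_data", "/mnt/raw"),
    ("queries_a", "/mnt/queries_a"),
    ("queries_b", "/mnt/queries_b"),
    ("queries_c", "/mnt/queries_c"),
]


def tablespace_sql(name, mount_point):
    """Build a CREATE TABLESPACE statement for one mounted volume."""
    # PostgreSQL expects an existing, empty directory owned by the postgres user.
    location = f"{mount_point}/pgdata"
    return f"CREATE TABLESPACE {name} LOCATION '{location}';"


def all_statements(volumes):
    """Return one CREATE TABLESPACE statement per (name, mount point) pair."""
    return [tablespace_sql(name, mount) for name, mount in volumes]


if __name__ == "__main__":
    for stmt in all_statements(DATA_VOLUMES):
        print(stmt)
```

The printed statements would be run in psql as a superuser, after which tables can be created with a `TABLESPACE` clause pointing at the volume they belong on.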

Also posted in blog entries | 2 Comments

One of my data scientist heroes

Design already requires superior technical skills, and those requirements will only increase now that it’s becoming more closely linked to data visualization. This is because the value of data can only be unlocked when it is analyzed and turned into information. This important work will be done by a new generation of data scientists.

Lisa Strausfeld is an example of a data scientist hero. Her goal is to change the way we view government statistics and make them accessible and understandable to more people through design. Check out the link below to read about her experience.

Fast Company’s Masters of Design – Lisa Strausfeld

Posted in work-related