A List of 8 Lists! Programming Principles and Best Practices

I’ve been studying up for my new job, which is beginning next Monday.  Even though I’m going to be a Systems Administrator and not a programmer, I have a feeling I am going to need all the programming skills I can get my hands on to contribute the maximum to the database and product development efforts.

Here is a subset of the links I found – enjoy!

  1. 5 Habits of Highly Profitable Software Developers
  2. 8 Elite Development Principles
  3. 10 Essential Development Practices
  4. 10 SQL Server Integration Best Practices
  5. 15 Best Practices for Writing Super Readable Code
  6. 20 Design Tips for MySQL Enterprise Data Architects
  7. 28 Essays in Best Software Writing
  8. 97 Things Every Programmer Should Know
Also posted in blog entries | Leave a comment

Fighting Financial Fraud: Audit Integrity, LLC

I recently had the chance to visit a company that provides Accounting Fraud Risk Metrics for public US companies. Even though many of the data analysis techniques are the same, their overall outlook is vastly different from what I’m used to, due to their exclusive focus on company balance sheets and publicly released accounting data. In the world of quant research, and especially in HFT, there is a huge disconnect between the fundamentals and the information used to devise actionable strategies.

While I was working in the quant investment sector, it was almost pointless looking at company balance sheets. For our purposes, it didn’t matter whether companies were healthy or unhealthy – only their trading characteristics mattered. And since we were focused on the market as a whole rather than any single corporation, balance sheet analysis really never came up at all. Even if it had, I would have taken what was written in the public filings with a grain of salt, after hearing about the shenanigans of Lehman Brothers, Bear Stearns and all those other Wall Street jokers.


Update on my SiI 3114 SATA RAID5 array

Well, it’s been a week now since I set up the 4-disk RAID5 array using a 16KB stripe size and writing with 64KB TrueCrypt cluster sizes. While the read performance was good, write performance was absolutely dismal.

I somewhat expected writes to be slow because of the parity calculations during RAID5 write operations, but the actual performance was far worse than I’m willing to tolerate. On a 2GB file, writes would start out at 40MB/s and then slow to a paltry 20MB/s after a minute or so. That was even after some settings improvements, because when I had a sub-optimal stripe size and cluster size combination, the same file would transfer at only 12MB/s. I would actually be OK with 20MB/s, but I found out that if I queued up a large number of files to be written at once, it would cause the controller to crash after 50GB or so. TERRIBLE!
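To make the "parity calculations" part concrete, here is a minimal sketch of generic RAID 5 parity in Python. It is not a description of what the SiI 3114 driver actually does; the block contents and the `xor_blocks` helper are made up for illustration. It shows why a small write costs extra I/O: changing one data block also means reading the old data and old parity, then writing new parity.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 4-disk array: three data blocks plus one parity block.
# (Tiny 4-byte blocks, purely illustrative.)
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# A small write that changes only d1 still needs the old data and the old
# parity (two reads) before writing new data and new parity (two writes).
new_d1 = b"ZZZZ"
new_parity = xor_blocks(parity, d1, new_d1)

# The read-modify-write shortcut matches recomputing over the full stripe.
assert new_parity == xor_blocks(d0, new_d1, d2)

# Any single lost block can be rebuilt from the other three.
rebuilt_d0 = xor_blocks(new_d1, d2, new_parity)
assert rebuilt_d0 == d0
```

The same XOR that makes writes more expensive is what lets the array rebuild a failed disk from the surviving ones.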


Finding a balance between security, stability, and performance with a SiI 3114 RAID controller

I have a Silicon Image 3114 SATA RAID controller built into my Asus motherboard and I was trying to decide on the optimal stripe size for my 4-disk RAID 5 array. The problem I’m having is, there are a lot of articles online about the tradeoffs between large and small stripe sizes, but I haven’t found a completely clear guide yet that applies specifically to my situation. I found that most guides say that for larger files, a small stripe size is better, and for numerous small files, a large stripe size is better.

I’m not sure, but I would guess that for large files, a small stripe size is better because it gives finer-grained streaming and more speed, since the reads and writes are closer in size to the 512 bytes of a typical disk sector. (I wonder how RAID 5 will perform on the upcoming 4KB-sector disks.) I am also guessing that for smaller files, a large stripe size is better because it reduces the number of seeks needed when multiple files are retrieved at once. (The largest stripe available on my controller was 128KB; that’s a lot of text at once if I were running a database.)
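The guess above can be roughed out in a few lines of Python. This is only a sketch under some loud assumptions: that the controller's "stripe size" means the chunk written to each disk, that files are contiguous and aligned to a stripe boundary, and that one chunk per stripe is parity. The `stripe_footprint` function is hypothetical, not something from the controller's tools.

```python
import math

KB = 1024


def stripe_footprint(file_bytes: int, stripe_unit_bytes: int, disks: int = 4) -> dict:
    """Estimate how a contiguous, aligned file spreads over a RAID 5 array."""
    data_disks = disks - 1  # assumption: one chunk per stripe holds parity
    units = math.ceil(file_bytes / stripe_unit_bytes)
    return {
        "stripe_units": units,                          # chunks the file occupies
        "data_disks_touched": min(units, data_disks),   # disks busy for this file
        "full_stripes": units // data_disks,            # stripes written in full
    }


# A 64KB file at the two stripe sizes mentioned in these posts.
small_stripe = stripe_footprint(64 * KB, 16 * KB)
large_stripe = stripe_footprint(64 * KB, 128 * KB)

print(small_stripe)  # {'stripe_units': 4, 'data_disks_touched': 3, 'full_stripes': 1}
print(large_stripe)  # {'stripe_units': 1, 'data_disks_touched': 1, 'full_stripes': 0}
```

Under these assumptions, a 64KB file with a 128KB stripe sits on a single disk, leaving the other disks free to serve other requests, while a 16KB stripe spreads the same file across all the data disks. That lines up with the guess about seeks, but it is a model, not a benchmark.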

The best guide I found for RAID 5 stripe sizing
(more…)


Latest News, as of March 16th

Ahhh… just had a nice talk with my Lab Manager and PI from the DNA sequencing lab today, it’s great to know that they’re watching out for me even though I’m going to be leaving for a new job… I went in to talk about the plans for their computer systems once I’m gone, but they wouldn’t let me go so easily! I told them about my plans to go to Taiwan if I can’t find a job, so they’re trying to find me one in San Diego to get me to stick around longer and train a mini-me freshman or sophomore who can help out with the CCDNASEQ servers.

On my home systems, I have some great new software that I’ve been trying and want to tell you guys about! But first, I want to talk about some hardware – I just installed a gigantic RAID 5 array made up of four 320GB disks for a total of 894GB of usable space. It can survive one disk failure without the system going down, but I would have to replace the drive before a second one fails or else I would lose data. I’m not one to move all my data to new hardware without testing it, so I’m doing a full disk encryption using TrueCrypt. It’s projected to take a total of 12 hours to completely overwrite the entire 894GB. Hopefully everything will be A-OK in the morning.
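For anyone wondering where 894GB comes from, here is a quick back-of-the-envelope check in Python. It assumes each drive holds exactly 320 × 10^9 bytes (real drives differ slightly) and that the 894 figure is reported in binary units, which is how Windows labels "GB". The function name is just for illustration.

```python
GB = 10**9    # decimal gigabyte, as printed on drive labels
GIB = 2**30   # binary gibibyte, which Windows displays as "GB"


def raid5_usable_bytes(disks: int, disk_bytes: int) -> int:
    """RAID 5 gives up one disk's worth of space to parity."""
    return (disks - 1) * disk_bytes


usable = raid5_usable_bytes(4, 320 * GB)   # 960 decimal GB
usable_gib = usable / GIB                  # roughly 894 binary "GB"

# Average write rate implied by a 12-hour full overwrite.
hours = 12
avg_mb_per_s = usable / (hours * 3600) / 10**6   # roughly 22 MB/s

print(f"{usable / GB:.0f} GB decimal = {usable_gib:.0f} GiB")
print(f"average write rate over {hours} h: {avg_mb_per_s:.1f} MB/s")
```

So three of the four disks hold data, and a 12-hour overwrite works out to an average in the low 20s of MB/s.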

So, on to the new software. Like all software I recommend on this blog, this software is 100% free (ad supported).

  1. Dataram RAMDisk
    • This is the best Free RAMDisk software available for Windows
    • Compatible with XP, Vista, and 7, in both 32-bit and 64-bit editions
    • Allows you to have up to 4GB RAMDisk to make your internet cache super fast (10X faster than your hard drive)
    • Since your browser only saves to your RAMDisk, wear on your HDD is reduced, making it less likely that it’ll die and bring your data along with it.
  2. NewsGator FeedDemon
    • This is the best Free Desktop RSS Reader available for Windows
    • The key here is that it allows you to prefetch the images for all your unread items.
    • Just set it to prefetch and come back in 10 minutes. Combined with the use of a RAMDisk, that means NO WAITING to scroll through your subscribed feeds.
    • I hate waiting for images to load. Even 250ms is really noticeable, and that’s about how long it takes for each additional HTTP request! Prefetch FTW.
    • I used to be a hardcore Google Reader user; this feature alone is what caused me to switch.
Also posted in blog entries, coolness | Leave a comment