Archive for the ‘Uncategorized’ Category

Facebook & Open Source: Community is just as important as the Code

Tuesday, February 2nd, 2010

I was happy to attend the “Facebook Technology Tasting” event tonight, where they gave a presentation about their newest open source project, HipHop for PHP.

HipHop is definitely some very cool technology, built by an enthusiastic team, solving real-world performance issues in large-scale websites.  I have no doubt other companies using PHP (Hello Yahoo!) will find it invaluable, and hopefully help turn it into a successful open source project.

What I find most interesting and encouraging about Facebook’s most recent open sourcing efforts, HipHop today and Tornado last year, is how they are taking a dramatically different approach from their earlier open source projects like Thrift or Cassandra.

Thrift, purely as an example, was one of their first projects built internally and later open sourced.  It was originally open sourced on April 1, 2007, but it had a difficult time building a community around the code.  The approach was a blog post, and code basically ‘tossed over the wall’.  External developers did try to contribute, but I believe the interactions were less than optimal, as the original forum for discussion was a Facebook Group.  They learned from this quickly: programmers didn’t like web forums for submitting patches, and proper mailing lists were later set up.

Cassandra was another project that was essentially thrown over the wall: the code was available, but there was no initiative to build a community around it.

Today, both Thrift and Cassandra have found their way to the Apache Software Foundation, via independent paths.  Apache Cassandra is turning into a very healthy community, having made many releases, and is in the process of graduating to a top level project.  Apache Thrift made its first release in December 2009, and has been slowly gathering more external contributors and an open community built around the code.

What I see happening with both HipHop and Tornado is completely different, and that is what is most encouraging.  From the start, they are doing everything right to encourage an open community to be built around these projects.  Open communities are what create successful projects, and give companies creating the open source projects the most rewards.

When you create an open source project, you gain almost nothing but a PR hit if there isn’t a community built around it.  For infrastructure projects like HipHop, Cassandra, Thrift, Scribe, and Tornado, the biggest reward from open sourcing comes from having other people hack on the code and, more than that, use the code in their own companies.

Just look at the massive community that has exploded around Apache Lucene and Apache Hadoop.  Yahoo could have kept this infrastructure project internal, and sure, it might have fulfilled their original goals, but they wouldn’t have ever received the thousands of external contributions, which have turned the Lucene/Hadoop world into one of the most diverse and thriving open source communities of late, giving Yahoo a thousand times return on their investment in Hadoop.

Thank you Facebook for getting it — community is just as important as the code that you are open sourcing, and I would like to wish the HipHop for PHP developers the best of luck with their new open community project!

Released Cloudkick’s for-pay products

Monday, January 25th, 2010

I started at Cloudkick in August, and today we announced our for-pay products & Freemium model.  (TechCrunch, GigaOm, ReadWriteWeb, VentureBeat, and more)

I’ve been working along with the entire Cloudkick Team on a few parts of our launch:

  • Integration with Apache libcloud, so now Cloudkick supports EC2, Rackspace, Slicehost, RimuHosting, Linode, VPS.net and GoGrid.
  • Our new monitoring Agent, Cloudkick Agent. Extremely lightweight, written in C & Lua.  Hopefully I’ll have some time to blog more about some of the cool technology we did here, but we were tired of seeing monitoring agents written in high level languages using up tons of memory on a Cloud Server.
  • Our Cloudkick Changelog Tool, aka “ckl”.  This tool lets you keep track of a large admin team and what everyone is doing.  The ASF Infrastructure team has already started using it outside of Cloudkick.  Of course, the Cloudkick UI is a little nicer than the demo one with the open source code.
  • Our new Graphing and long term trending system, built on top of Reconnoiter and Apache Cassandra.
  • Learned more about Django and jQuery than ever before.

Now that our big launch is out, hopefully I’ll find a little more time to post on this journal.

httpd: mod_cache only caching your homepage

Monday, January 25th, 2010

mod_cache has a pretty inflexible configuration setup.  CacheEnable can only take a prefix of a path to be cached, and to disable a sub-path with CacheDisable, you need to list all of the possible prefixes (i.e., no regular expressions).

Let’s say you want to cache just your root page, aka ‘/’, for your website, just in case you get hit by a Slashdot Effect.

For Apache httpd 2.2.12 or newer, you can do this by first enabling caching on all pages, then setting the no-cache environment variable globally, and then unsetting it for a specific path:

CacheDirLevels 2
CacheDirLength 1
CacheEnable disk /
CacheRoot /var/cache/apache2/mod_disk_cache
CacheIgnoreHeaders Set-Cookie
CacheIgnoreNoLastMod On
CacheMaxExpire 600
SetEnv no-cache
<LocationMatch "^/$">
UnsetEnv no-cache
</LocationMatch>
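
As a rough way to reason about which URLs end up cacheable, here is a small Python sketch of the set-then-unset logic above. It is only a model of the config, not how httpd evaluates it internally, and the function name is made up for illustration.

```python
import re

# Same pattern as the <LocationMatch> block above: only the bare root path.
HOMEPAGE = re.compile(r"^/$")


def no_cache_env(path: str) -> bool:
    """Return True if the no-cache variable is still set for this URL path."""
    env = {"no-cache": ""}         # SetEnv no-cache, applied everywhere
    if HOMEPAGE.search(path):      # UnsetEnv no-cache inside the LocationMatch
        env.pop("no-cache", None)
    return "no-cache" in env


if __name__ == "__main__":
    for path in ("/", "/about", "/index.html"):
        state = "not cached" if no_cache_env(path) else "cacheable"
        print(f"{path}: {state}")
```

The anchored pattern is the important part: anything other than the exact path ‘/’ keeps no-cache set, so only the homepage is left eligible for caching.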

For Apache httpd before 2.2.12, you need a different method of disabling caching globally, and then re-enabling it.  The easiest way is using mod_headers to muck with the Vary header:

Header set Vary *
<LocationMatch "^/$">
Header unset Vary
</LocationMatch>

Strictly speaking, doing this to the Vary header is an RFC violation, and your best bet is to upgrade to a newer httpd version.  :-)

This works because mod_cache will refuse to cache any HTTP resource with a Vary value of “*”, because this is saying that every response from the origin will be different.
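
The Vary check can be sketched in a few lines of Python. This only illustrates the rule described above, under the assumption that headers arrive as a plain dict with the exact key "Vary"; it is not mod_cache’s actual code.

```python
def vary_blocks_caching(headers: dict) -> bool:
    """Return True when the Vary header lists "*", i.e. nothing may be reused."""
    vary = headers.get("Vary", "")
    # Vary is a comma-separated list of field names; "*" may appear among them.
    fields = [field.strip() for field in vary.split(",")]
    return "*" in fields


if __name__ == "__main__":
    print(vary_blocks_caching({"Vary": "*"}))                # every other page
    print(vary_blocks_caching({"Vary": "Accept-Encoding"}))  # a normal response
    print(vary_blocks_caching({}))                           # Vary unset, like '/'
```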

httpd: disabling keep alive for hot linked images

Monday, January 25th, 2010

Let’s say you are running a website, and you don’t mind people hot linking images, like your logo, or other resources, and at the same time, you want to enable a (short) Keep Alive timeout for your normal users.

Normal anti-hot linking recipes, like the one on the HTTPD Wiki, are all about disabling access to the image completely.

If you have lots of people hot linking, these users can use up valuable Keep Alive sessions, so the easiest way to solve this problem is to disable Keep Alive for just those clients viewing a hot linked image.

This is possible by using mod_setenvif and the nokeepalive environment variable:

SetEnvIfNoCase Referer (.+) nokeepalive

SetEnvIfNoCase Referer (.*)example.com(.*) !nokeepalive

What this does is first disable KeepAlive for all users that have a Referer set, and then re-enable KeepAlive for those users who are coming from ‘example.com’, which should be replaced with your site.
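
To sanity check the two rules before deploying them, here is a small Python model of how they combine. The patterns are copied verbatim from the directives above (so the dot in example.com is still a wildcard), and the function name is invented for this sketch.

```python
import re

# SetEnvIfNoCase matches case-insensitively, hence re.IGNORECASE.
HAS_REFERER = re.compile(r"(.+)", re.IGNORECASE)
OWN_SITE = re.compile(r"(.*)example.com(.*)", re.IGNORECASE)


def keepalive_disabled(referer) -> bool:
    """Model the nokeepalive variable for a Referer value (None if absent)."""
    value = referer or ""
    nokeepalive = False
    if HAS_REFERER.search(value):
        nokeepalive = True     # rule 1: any non-empty Referer sets it
    if OWN_SITE.search(value):
        nokeepalive = False    # rule 2: !nokeepalive removes it again
    return nokeepalive


if __name__ == "__main__":
    for ref in (None, "http://www.example.com/", "http://forum.example.org/t/1"):
        print(ref, "->", "no keep-alive" if keepalive_disabled(ref) else "keep-alive")
```

Direct visitors (no Referer) and visitors browsing your own pages keep Keep Alive; only requests referred from elsewhere lose it.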

I have a plan…

Sunday, December 13th, 2009

for snickerdoodles:

now it’s just down to execution.

crazier than expected.

Wednesday, September 2nd, 2009

Last week ended with a boom.  I was about to start upgrading wiki.apache.org to a modern version of MoinMoin when I noticed some odd CGIs running on www.apache.org.  All of that mess is just about over now.  We (royal) have put up an initial report, and just today put up a report with more details and some of the actions we are taking in response.

The coolest thing coming out of it is some motivation to finish SvnPubSub.  You can try it out right now, it is running on our svn-master:

curl -i http://svn-master.apache.org:2069/dirs-changed/xml

(data formats, URLs, etc will change, don’t use it for anything special yet)

The whole apache.org incident though has killed my time for hacking on serf, but I’ll hopefully find some time to finish the server support soon.

Other misc stuff:

  • Moving up to San Francisco in the next month, will do another post with more details later.
  • Cloudkick is going well, doing fun stuff with Python and Twisted, hopefully have more to blog about there soon.

downtime page in apache

Monday, August 24th, 2009

It is good practice to send 503 Status codes when your site has downtime or is doing an upgrade.

The easiest way to do this for all your URLs is something like this using mod_asis:

       # Bind mod_asis to files ending in .asis
       AddHandler send-as-is asis
       # Add other Aliases/AliasMatches for any other resources needed (logos, css, etc)
       Alias /logo.png /opt/mysite/maint/logo.png
       # The magic line, pulling all matching URLs into one file
       AliasMatch /(.*) /opt/mysite/maint/index.html.asis

Your index.html.asis would contain something like this:

Status: 503
Cache-Control: no-cache
Content-type: text/html

<html><h1>My site is down, be back soon!</h1></html>
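
If you want to check an .asis file before pointing every URL at it, a few lines of Python can split it the way I understand send-as-is to read it: header lines, a blank line, then the body, with Status setting the response code. This is a sketch under that assumption (including the default of 200 when no Status line is present), not mod_asis itself.

```python
def parse_asis(text: str):
    """Split send-as-is content into (status, headers, body)."""
    head, _, body = text.partition("\n\n")
    status = 200  # assumed default when no Status: line is given
    headers = {}
    for line in head.splitlines():
        name, _, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if name.lower() == "status":
            # Accept both "503" and "503 Service Unavailable".
            status = int(value.split()[0])
        else:
            headers[name] = value
    return status, headers, body


if __name__ == "__main__":
    sample = (
        "Status: 503\n"
        "Cache-Control: no-cache\n"
        "Content-type: text/html\n"
        "\n"
        "<html><h1>My site is down, be back soon!</h1></html>\n"
    )
    print(parse_asis(sample))
```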

I can’t help you make a fail whale / plumber image though, that is up to you

email is awesome

Sunday, August 23rd, 2009

my favorite part of mbox_parse.c:

/**
 * List of all C-T-E Types found on httpd-dev and FreeBSD-current:
 *
 * Content-Transfer-Encoding:      8bit
 * Content-Transfer-Encoding:  7bit
 * Content-Transfer-Encoding: 7BIT
 * Content-Transfer-Encoding: 7Bit
 * Content-Transfer-Encoding: 7bit
 * Content-Transfer-Encoding: 8-bit
 * Content-Transfer-Encoding: 8BIT
 * Content-Transfer-Encoding: 8Bit
 * Content-Transfer-Encoding: 8bit
 * Content-Transfer-Encoding: BASE64
 * Content-Transfer-Encoding: BINARY
 * Content-Transfer-Encoding: Base64
 * Content-Transfer-Encoding: QUOTED-PRINTABLE
 * Content-Transfer-Encoding: Quoted-Printable
 * Content-Transfer-Encoding: base64
 * Content-Transfer-Encoding: binary
 * Content-Transfer-Encoding: none
 * Content-Transfer-Encoding: quoted-printable
 * Content-Transfer-Encoding: x-uuencode
 * Content-Transfer-Encoding:7bit
 * Content-Transfer-Encoding:quoted-printable
 *
 * This is why we have RFCs.
 */
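
For a sense of what taming that list takes, here is a minimal normalizer. It is written in Python for brevity and is not the code from mbox_parse.c; the alias table is my own guess at a sensible mapping.

```python
# Non-standard spellings mapped to the token they presumably meant.
ALIASES = {"8-bit": "8bit"}


def normalize_cte(header_line: str) -> str:
    """Return the lowercased, trimmed encoding token from a C-T-E header line."""
    name, sep, value = header_line.partition(":")
    if not sep or name.strip().lower() != "content-transfer-encoding":
        raise ValueError("not a Content-Transfer-Encoding header: %r" % header_line)
    # Tolerate any amount of whitespace (or none) after the colon.
    token = value.strip().lower()
    return ALIASES.get(token, token)


if __name__ == "__main__":
    for line in (
        "Content-Transfer-Encoding:      8bit",
        "Content-Transfer-Encoding:7bit",
        "Content-Transfer-Encoding: QUOTED-PRINTABLE",
        "Content-Transfer-Encoding: 8-bit",
    ):
        print(normalize_cte(line))
```

Values like “none” are passed through lowercased; what to do with them is left to the caller.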

zfs+freebsd pain

Thursday, August 20th, 2009

Having some fun times with people.apache.org:

minotaur# uname -a
FreeBSD minotaur.apache.org 7.2-STABLE FreeBSD 7.2-STABLE #0: Wed Aug  5 01:05:27 UTC 2009     root@loki.apache.org:/usr/obj/usr/src/sys/MINOTAUR  amd64
minotaur# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h1m, 0.01% done, 266h36m to go
config:

        NAME           STATE     READ WRITE CKSUM
        tank           DEGRADED     0     0     0
          raidz2       DEGRADED     0     0     0
            replacing  DEGRADED     0     0     0
              da14     UNAVAIL      3   570     0  experienced I/O failures
              da0      ONLINE       0     0     0  6.24M resilvered
            da1        ONLINE       0     0     0  4.06M resilvered
            da2        ONLINE       0     0     0  4.15M resilvered
            da3        ONLINE       0     0     0  4.09M resilvered
            da4        ONLINE       0     0     0  4.14M resilvered
            da5        ONLINE       0     0     0  4.10M resilvered
            da6        ONLINE       0     0     0  4.15M resilvered
            da7        ONLINE       0     0     0  4.11M resilvered
            da8        ONLINE       0     0     0  4.17M resilvered
            da9        ONLINE       0     0     0  4.08M resilvered
            da10       ONLINE       0     0     0  4.13M resilvered
            da11       ONLINE       0     0     0  4.14M resilvered
            da12       ONLINE       0     0     0  4.14M resilvered
            da13       ONLINE       0     0     0  4.08M resilvered
        spares
          da14         AVAIL   

errors: No known data errors

da14 failed. We had da0, which was not in the array yet, so we just did:

zpool replace tank da14 da0

But now it is stuck.

It never makes progress on the resilver.

It sure sounds like this bug:
http://bugs.opensolaris.org/view_bug.do?bug_id=6655927

But this is FreeBSD 7-STABLE from earlier this month, and it really shouldn’t be affected by that bug.

Sigh.

the comcast local monopoly sucks

Wednesday, August 12th, 2009

I pay Comcast $155.00 a month for TV and fast internet.  Barring the download limits I never hit, it is an acceptable service, even if still overpriced.

[image: Monopoly “Bank error in your favor” card]

Yesterday I got a call from Comcast.  I didn’t think much of it.  At least how I remember it, the lady said the Auto-bill pay had made a mistake and overcharged me for July.  $240 instead of $155 or something.  And they noticed this mistake, and had corrected it. Great, so.. I should just be getting a credit. Woohoo, money back I never noticed leaving.

Today when I woke up, the internet wasn’t working.

[image: Monopoly “Go to Jail” card]

It was showing a stupid Comcast activation screen.  To activate…. it wanted me to download some comcastic activation software.  Screw it, I was feeling particularly dumb this morning and I did it on my older Macbook Pro.  It completely FUBARed my network settings.  Glad I didn’t do it on my primary work machine.

Okay, well, one dead laptop later, I decided to call Comcast.

I will give Comcast some credit here, I was speaking to a Real Human Being in less than 2 minutes.  I remember being on hold for USWest/Qwest for hours when dealing with any DSL issues.

I just try to play dumb, ‘my internet has stopped working’, and the lady says it was turned off for non-payment.

Non-payment? I’ve had auto-bill pay on for 4 years, since I moved in.  I have auto-bill-pay with Comcast because I never ever ever want to deal with their crap or be on the phone with them. I WANT FAST INTERNET AND LEAVE ME ALONE.

It turns out that $242 was from someone ELSE who had paid onto MY account in June.

What happened yesterday is that Comcast reversed that payment.  Since the auto-bill pay thought I had already paid for June and most of July, I now had a delinquent charge of $242, overdue for 2 months.

So they turned off my service.

Well, turning off my service when they fucked up sure is one way to get my attention.

So, I paid $403 to keep the internet flowing.  It isn’t like I had a choice in high speed internet.

In my rage I did change my TV plan from Digital Rare-Metal to the basic standard plan, but I wish I could have just canceled my entire account.

If I had a choice, I would choose any other service that can offer me 20 megabit/second downloads.

AT&T in theory can offer DSL service in my area, but I don’t have a landline, and they don’t offer anywhere near 20mbit/second downloads.

Sigh.
Edit: While writing this post, Safari crashed. Twice.  :rage:

malloc debugging on OSX

Tuesday, August 11th, 2009

I can never remember all of the options for malloc(3) on OSX when debugging. So I’m posting it here so I can find it with Google Search next time I need to find it:

export MallocLogFile=/tmp/malloc.log
export MallocGuardEdges=1
export MallocStackLogging=1
export MallocStackLoggingNoCompact=1
export MallocPreScribble=1
export MallocScribble=1
export MallocCheckHeapAbort=1
export MallocBadFreeAbort=1
ulimit -c unlimited
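
If you would rather not export all of that into your whole shell session, one option is to build the environment for a single child process. A minimal Python sketch, assuming the same variables as above; the helper name and the ./my_program path are made up, and the ulimit line still has to be handled separately.

```python
import os


def malloc_debug_env(log_file="/tmp/malloc.log", base=None):
    """Return a copy of the environment with the OSX malloc debug variables set."""
    env = dict(os.environ if base is None else base)
    env.update({
        "MallocLogFile": log_file,
        "MallocGuardEdges": "1",
        "MallocStackLogging": "1",
        "MallocStackLoggingNoCompact": "1",
        "MallocPreScribble": "1",
        "MallocScribble": "1",
        "MallocCheckHeapAbort": "1",
        "MallocBadFreeAbort": "1",
    })
    return env


if __name__ == "__main__":
    # Show which debug variables would be handed to the child process.
    for key, value in sorted(malloc_debug_env(base={}).items()):
        print(f"{key}={value}")
```

Running the program under it would then be something like subprocess.run(["./my_program"], env=malloc_debug_env()).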

similar malloc debugging for linux (glibc only writes this trace if the program calls mtrace()):

MALLOC_TRACE=/tmp/out.log

related: memcheck.
and more info on the suse wiki.

Joining Cloudkick

Wednesday, July 29th, 2009

I am happy to announce that I am joining Cloudkick next week.

It should be a different experience compared to Joost: very small company vs medium, everyone is local vs everyone is distributed, and languages I enjoy (Python, C) vs ones I don’t (Java, Javascript).

“Fear not for the future, weep not for the past.” – Percy Bysshe Shelley

Leaving Joost

Wednesday, July 29th, 2009

Friday July 31 will be my last day at Joost.

It was a fun ride for the last 19 months.  Starting out working on Joost’s P2P library, later on our Content Delivery over HTTP, and more recently writing Server Side Javascript inside Rhino, along with countless trips to Leiden and New York City.  Thanks to all of my colleagues who I worked with, Joost was a great experience.

As announced a few weeks ago, Joost is changing its focus to white label services.  I took a long look at this, but in the end I decided the new Joost just wasn’t for me right now in my life.

Good luck to everyone who is staying on at Joost.

My new job is with Cloudkick.

Being unique

Wednesday, June 24th, 2009

While it can be fun to be the only one in the world doing something, much more thought should be given to why no one else is doing something.

New iphone camera

Friday, June 19th, 2009

Better quality ?