posted in programming by Patrick Fitzsimmons on

"While" loops Considered Harmful

Your phone buzzes with the "server alert" ringtone. Simultaneously, your manager taps you on the shoulder: "Um, it looks like we're down." Not a single screen in your web app loads. You attempt to SSH into your server but connecting takes ... for ... ever ... Finally you get on the box and see the CPU is pegged at 100%. You look at the access logs, but nothing looks fishy. Load is normal. Frantic and grasping at straws, you reboot the server. The server comes back up ... but immediately the CPU goes to 100% again.

An hour later you find the problem. That hour seemed like an eternity as you tore out your hair, fended off account managers, grepped every line of source you could find, and wondered what it feels like to get fired. And what was the error? It was a while loop that processed some unexpected input and spun forever.

Infinite loop bugs are among the most insidious and (outside of wiping out customer data) most destructive bugs that can plague engineers. In web application development, most bugs only affect the particular code path where the buggy code resides. Most of the time you know which paths are most critical and can pay them the most attention. But a while loop bug in some unimportant, admin-only screen can disable the entire application for everyone. When the problem strikes, it is very difficult to figure out what exactly is causing the app to freak out and where the buggy code lives.

Fortunately, there is a solution to avoid this class of bug completely: never write a while loop.

Anywhere in your code where the need arises for a while loop, instead write a "for-loop" with a sensible maximum.

For example, let us say that I am doing a standard chunked-file read in python. In my younger days I would write:

f = open('file.mp3', 'rb')  # binary mode, since an mp3 is not text
while True:
    chunk = f.read(4000)
    if not chunk:
        break
    output.write(chunk)  # output: any writable file-like object

A better pattern is:

# pick a number comfortably higher than the logic of the function
# needs to support.
EMERGENCY_BRAKE = 10000000
# xrange is Python 2; on Python 3 use range instead
for x in xrange(EMERGENCY_BRAKE):
    chunk = f.read(4000)
    if not chunk:
        break
    output.write(chunk)
else:
    # the else branch runs only when the loop finished without a break,
    # i.e. every allowed iteration was used up
    raise Exception("File reading loop hit the emergency brake")

In the trivial case of reading a file, the lack of an emergency brake is unlikely to cause a problem. But when you start creating more complicated while loops, such as for tree traversal, streaming parsers, etc., while loops become very dangerous. I therefore highly recommend eliminating them altogether.

(Note: this blog post targets web application programmers. There are of course cases in software development where while loops are still fine. If you are writing the main event loop for a sensor on the next Pioneer space probe, your while loop should keep happily plugging away until the end of the universe.)

posted in programming by Patrick Fitzsimmons on

Romania and Transylvania - Pleasant and Vampire Free

Bucharest

[Photo: Bucharest]
Bucharest was a very pleasant city. It had some nice historical architecture, but it wasn't as over-the-top as Vienna.


Remnants of the heavy-industry-focused communist era still remain. Above is an old, deactivated nuclear power plant.


[Photo: Bucharest manhole cover]
Apparently lawsuit culture has not reached Bucharest. This is one of many open manholes into which an unsuspecting tourist, reading his guidebook while walking, could fall and plunge towards an untimely death. Bucharest was modern (cars, wifi, iPhones, etc. were everywhere) but you could definitely tell it was a little behind, a little less polished, than the cities in the more western countries. My hosts told me that even in the last ten years Bucharest has modernized rapidly.


My host complained that the bands touring Bucharest were all washed-up American bands. I saw posters for the Scorpions, Michael Bolton, Seal, etc. Despite his complaint, my host admitted that he had tickets to an upcoming concert by one of these bands.

 


This was the only size ice cream that the shop I stopped at offered. Perhaps this is why Romanians are so much less fat than Americans.


I wish Boston/Cambridge had cafes that were actually inside parks, where you could sip a beer or a cappuccino while enjoying the greenery.

Brashov, Transylvania area of Romania

[Photo: downtown Brashov]
After Bucharest, I hopped on a train to the historic city of Brashov, in southern Transylvania.

 

[Photo: Brashov sprawl]
Travel pictures always make it seem like every place other than home is more beautiful and idyllic than home. But the historic town center of Brashov was actually fairly small, and most of the city now looks like more dismal, modern-style sprawl.

 

[Photo: Brashov palace]

This is a palace built in the late 19th century for the Romanian king. The inside was absolutely stunning. The detail was so ornate you could just gaze at one wall of a room for ten or fifteen minutes and still see new things. Unfortunately, it was forbidden to take pictures inside, so you'll have to google "Peles Castle" to find some photos.


posted in programming by Patrick Fitzsimmons on

How to Turn Your Windows Machine into Unix

For the past few years, the requirements of my job have forced me to use Windows for software development. For a long time I dearly missed features of the unix environment (ls, grep, the shell, etc). But over the years I've figured out a bunch of hacks that make using Windows a lot more tolerable. Here they are:

1) Install a real text editor - Vim or Emacs. The optimized key bindings and extra power of Emacs/Vim allow you to write code or prose with amazing speed. Read Steve Yegge if you're not convinced. Put Emacs in the system path, so that as you browse around in the shell you can edit files by entering ">emacs file.txt"

2) Install the GNU utils for windows and put them on your system path. Now you can use all your favorite unix utilities such as grep, touch, ls, diff, etc, from your windows command line.

3) Enable copy/paste from the command line. In the windows command prompt, click on the icon in the upper left and then click properties. Then set "Quick Edit" mode to checked. Now you can copy text by highlighting it with the mouse and hitting enter. You can paste by clicking the right mouse button.

4) Install a better shell. I prefer IPython. IPython gives you most of the bash functionality and key bindings: Ctrl-R to search for previous commands, Ctrl-A to move to the beginning of the line, Ctrl-K to kill a line, Ctrl-Y to paste/yank a line back, etc. The tab completion works like unix, where it only completes until the point of ambiguity. The only annoying thing is that you must prefix all shell commands with "!". But on the plus side, you also have an interactive Python prompt, which is handy if you need to test a snippet of code or do some quick math.

5) Remap Caps Lock to trigger Ctrl. Caps Lock is basically useless. Ctrl is used all the time in programming, especially if you use Emacs. Remapping will make you type far more efficiently.

6) Forget about Cygwin. Cygwin offers the promise of a full unix shell from within windows. But the Cygwin environment has too many pieces that are broken or incompatible. You feel like you are in unix, but then you try to run a script and you get a stack trace from deep down in the system. You spend six hours trying to debug it before finding that the underlying C library is simply incompatible with the Cygwin environment. If you want to use linux on your windows box without dual booting, use a full VM like coLinux or VMware. I keep a small, headless coLinux process running CentOS 5 in the background that I SSH into if I need to do something that is linux only.


posted in programming by Patrick Fitzsimmons on

Installing pymssql on Vista/Windows 7 and Python 2.6

Today I was trying to install pymssql using the automated installer.  When I ran python and imported pymssql I received an error: "ImportError: DLL load failed: The specified module could not be found."

Turns out that I needed a newer version of ntwdblib.dll.  The folks at UserScape had made this DLL available here.  I downloaded it into my python26/Lib/site-packages folder, overwrote the existing ntwdblib.dll, and then it worked.

Googling around for the answer was not easy, so I'm posting this in case other searchers run into this problem. 

posted in programming by Patrick Fitzsimmons on

About me

I am an engineer for HubSpot, a startup in Cambridge building inbound marketing software.

Stuff I've Built

HubSpot Content Optimization System

The past year and a half I have been tech lead for the team building the next generation of the HubSpot content tools. We have built tools for creating blog posts, landing pages, emails, and full web sites. Features include visual template building, a powerful templating language, WYSIWYG editing, A/B testing, and integration with the contacts system and analytics.

The system was built in Python running on Django.

   

 

HubSpot Analytics and Sources Application

I led a team building an analytics system that allowed our customers to discover where visitors to their site originated and how they converted into customers and leads, in a set of easy-to-use charts and tables.

The project used Python and Java for the programming, and the log processing jobs ran on Hadoop.

 

HubSpot Content Management System - Version 1 (2006-2008)

When I joined HubSpot I took over the early prototype of the HubSpot content tools, which was a custom DotNetNuke installation running in VB on the ASP.NET stack. I converted it to C# and rebuilt the UI with drag and drop controls to make it easy for marketers to use. I also built out features such as analytics dashboards, a drag and drop form builder, an inline content editor, and a browser based template editor.

   

 

GroupSharp 2005

During my senior year of college I founded a company to build a way for non-technical users to build and host an online database. I took the project from initial idea to the point where I was making money advertising via Google AdWords and selling subscriptions to the product. The company was "acquihired" by HubSpot.

 

Lanovision and Coffeeshop

During college I developed an application to share music and videos with other people in the dormitory.

 

 

 

posted in programming by Patrick Fitzsimmons on

Nine Javascript Gotchas



1) Comma Caused Corruption




<script>
  var theObj = {
        city : "Boston",
        state : "MA",
  }
</script>

Notice the comma after "MA"?  It will be the source of many woes.  Firefox will pay it no heed, but it will create a syntax error in IE.  Worst of all, IE will not tell you where the actual bug is.  The only solution is to scan through your entire 2,500 line javascript file trying to find that extra comma.



2)  "this" can change which object it's pointing at



Take a look at this code sample.

<input type="button" value="Gotcha!" id="MyButton" >
<script>
// "new" makes "this" inside the function a fresh object, which becomes MyObject
var MyObject = new function () {
    this.alertMessage = "Javascript rules";
    this.OnClick = function() {
          alert(this.alertMessage);
      }
}();
document.getElementById("MyButton").onclick = MyObject.OnClick;
</script>


The function MyObject.OnClick will actually give a different alert, depending on how it's being called.

If you call MyObject.OnClick(); you will get a popup saying "Javascript rules".

However, if you click on the button "MyButton", the popup will say "undefined".

When you assign MyObject.OnClick to the event handler, the special variable "this" now refers to the button, not to MyObject.

There are several ways to refer to MyObject.  My favorite is to introduce the "self" variable as a replacement for "this":

<input type="button" value="Gotcha!" id="MyButton" >
<script>
var MyObject = new function () {
    var self = this;
    this.alertMessage = "Javascript rules";
    this.OnClick = function() {
          alert(self.alertMessage);
      }
}();
document.getElementById("MyButton").onclick = MyObject.OnClick;
</script>

Now the alert will say "Javascript rules" no matter how the function is called. The variable "self" will always refer to MyObject.

Note there is one more gotcha with the above code.  Do not forget the "var" in "var self" or IE will throw a mysterious exception.
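
To see the difference without a browser, here is a minimal sketch that can be pasted into any JavaScript console. The names Greeter, viaThis, viaSelf, and fakeButton are made up for illustration; fakeButton is just a plain object standing in for the real button element.

```javascript
// Hypothetical constructor, written only to illustrate the "self" trick.
function Greeter() {
    var self = this; // captured once, never rebound
    this.alertMessage = "Javascript rules";
    this.viaThis = function () { return this.alertMessage; };
    this.viaSelf = function () { return self.alertMessage; };
}

var greeter = new Greeter();

// A plain object standing in for the button element.
var fakeButton = {};

// Calling through fakeButton makes "this" point at fakeButton,
// which has no alertMessage property.
fakeButton.onclick = greeter.viaThis;
var brokenResult = fakeButton.onclick();

// "self" still points at the Greeter instance.
fakeButton.onclick = greeter.viaSelf;
var fixedResult = fakeButton.onclick();
```

Here brokenResult comes back undefined, while fixedResult is the original message.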


3) Identity Theft



Never name a variable the same as an HTML ID:

<input type="button" id="TheButton">

<script>
    TheButton = document.getElementById("TheButton");
</script>

This will work fine in Firefox but cause an "object undefined" error in Internet Explorer.


4)  String replace only replaces the first occurrence



You might have code to turn a title into a URL slug:

<script>
    var fileName = "This is a title".replace(" ","_");
</script>

To your chagrin, fileName is actually equal to:
    "This_is a title"

Unlike replace in other languages such as C# or Python, only the first occurrence is replaced.  That's because when the first argument to replace is a plain string, JavaScript stops after the first match.

To replace all occurrences, pass a regular expression with the global modifier set. Use:

    var fileName = "This is a title".replace(/ /g,"_");
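
If you do this in more than one place, it may be worth wrapping it in a helper. The sketch below is only an illustration; the names slugify and slugifyBySplit are made up, and the second version shows a regex-free alternative using split and join.

```javascript
// Hypothetical helper: swap every space for an underscore.
function slugify(title) {
    // The g flag makes replace() act on every match, not just the first.
    return title.replace(/ /g, "_");
}

// Same result without a regular expression: cut the string apart at each
// space, then glue the pieces back together with underscores.
function slugifyBySplit(title) {
    return title.split(" ").join("_");
}

var fromRegex = slugify("This is a title");
var fromSplit = slugifyBySplit("This is a title");
```

Both fromRegex and fromSplit end up as "This_is_a_title".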

 


5)  MouseOut sometimes means MouseIn



When you have nested divs, the onmouseout event will fire for an outer box when you move inside the inner box.  For context menus or hover-overs, this is not the desired behavior.  My solution is to test for the mouse's location, and only take action if the mouse is actually positioned outside the outer box.
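
The location test can be sketched roughly as follows. The function isOutsideBox, the element variable outerBox, and the hideMenu callback are all hypothetical names; the browser wiring is left in comments because it depends on your page.

```javascript
// Hypothetical helper: true when the point (x, y) lies outside the
// rectangle described by box.left, box.top, box.right and box.bottom.
function isOutsideBox(x, y, box) {
    return x < box.left || x > box.right || y < box.top || y > box.bottom;
}

// In a browser, the handler might look like this (assumes an element
// stored in outerBox and a hideMenu function of your own):
//
// outerBox.onmouseout = function (e) {
//     e = e || window.event;
//     // getBoundingClientRect and clientX/clientY both use viewport coordinates
//     var rect = outerBox.getBoundingClientRect();
//     if (isOutsideBox(e.clientX, e.clientY, rect)) {
//         hideMenu();
//     }
// };

var sampleBox = { left: 0, top: 0, right: 100, bottom: 50 };
var stillInside = isOutsideBox(40, 20, sampleBox);  // over the inner box
var reallyLeft = isOutsideBox(140, 20, sampleBox);  // past the right edge
```

With this check, moving onto the inner box no longer counts as leaving the outer one.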

6)  parseInt scoffs at your base ten numbering system



parseInt is really nice, because it works with strings that are not pure digits.  I always find myself doing the following:
var height = parseInt("200px")

and get the height.

However, the default call to parseInt has a problem.

Guess what the value of monthInt will be:
  month = "09"
  var monthInt = parseInt(month )

If you guessed 9, then gotcha!  In older browsers (those that predate ECMAScript 5), the answer is 0.

When the string begins with a zero, those browsers interpret the value in base 8, and 9 is not a valid base 8 digit.  To fix this problem, do the following:

var monthInt = parseInt(month , 10);

Now monthInt will be equal to 9.  The second argument forces parseInt to use a base ten numbering system.




7)  for loops over the kitchen sink



I once had an array as such:
var arr = [5,10,15]
var total = 1;
I iterated over the array:
for ( var x in arr) {
    total = total * arr[x];

}

This piece of code worked fine, until one day I started getting an error.  The error said something like, "Cannot multiply object by a number."  This flummoxed me since there were no strings in the array.  But, lo, when I iterated over the array and logged each value, there was indeed a function object called "find".

The cause was a javascript library that we had recently installed.  This library added a "find" method to the javascript array object.  Nice to have, but the "for ... in" loop in javascript will iterate over all enumerable object attributes, including functions that have been added to the prototype.

But fortunately, the function was not counted in the "length" attribute.  Thus to fix the problem, I used another kind of for loop:

for ( var x = 0; x < arr.length; x++) {
    total = total * arr[x];

}

That worked perfectly.
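
Here is a small sketch that reproduces the problem. It simulates the library by adding a method with the made-up name hypotheticalFind to Array.prototype, then compares the two loop styles. The hasOwnProperty filter is an extra option the story above did not use, shown for the cases where you really do need "for ... in".

```javascript
// Simulate a library that extends every array (method name is made up).
Array.prototype.hypotheticalFind = function () {};

var arr = [5, 10, 15];

// "for ... in" walks the enumerable keys, including the inherited method.
var keysSeen = [];
for (var key in arr) {
    keysSeen.push(key);
}

// The index-based loop only visits positions 0 .. length - 1.
function product(values) {
    var total = 1;
    for (var i = 0; i < values.length; i++) {
        total = total * values[i];
    }
    return total;
}
var total = product(arr);

// If you must use "for ... in", skip anything the array does not own itself.
var ownKeys = [];
for (var k in arr) {
    if (Object.prototype.hasOwnProperty.call(arr, k)) {
        ownKeys.push(k);
    }
}

// Clean up so the simulated library does not leak into other code.
delete Array.prototype.hypotheticalFind;
```

In this sketch keysSeen picks up "hypotheticalFind" alongside the three indexes, while ownKeys holds only the indexes and total is 750.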



8)  Event Handler Pitfalls


Never set event handlers like the following:

window.onclick = MyOnClickMethod

1)  This will overwrite any handler that was already set, and your own handler can in turn be overwritten by some other javascript.
2)  This can introduce memory leaks in Internet Explorer in certain circumstances.

Instead, use a library that abstracts around the event handler, like YUI:

YAHOO.util.Event.addListener(window, "click", MyOnClickMethod);
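
If you are curious what such a library does under the hood, the core idea can be sketched in a few lines. This is not YUI's actual code, just a simplified illustration under the assumption that the target supports either the standard addEventListener or the legacy IE attachEvent; the name addListener here is my own.

```javascript
// Simplified, hypothetical stand-in for a library's listener helper.
// Both branches add a handler alongside existing ones instead of
// replacing them, which is the whole point.
function addListener(target, type, handler) {
    if (target.addEventListener) {
        // Standards-based browsers
        target.addEventListener(type, handler, false);
    } else if (target.attachEvent) {
        // Legacy Internet Explorer expects the "on" prefix
        target.attachEvent("on" + type, handler);
    }
}

// In a browser:
// addListener(window, "click", MyOnClickMethod);
```

A real library adds more on top of this, such as cleaning up listeners when the page unloads.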


9)  Focus Pocus



Often when I want to add inline editing to my app, I create a text field and then focus on it:

var newInput = document.createElement("input");
document.body.appendChild(newInput);
newInput.focus();
newInput.select();

However, the above code will create an error in IE.  The reason is that even though you have added the element, it is not really available yet.  Fortunately, a split second delay is all we need:

var newInput = document.createElement("input");
newInput.id = "TheNewInput";
document.body.appendChild(newInput);

// Pass a function rather than a string of code to setTimeout
setTimeout(function () {
    var input = document.getElementById("TheNewInput");
    input.focus();
    input.select();
}, 10);



posted in programming by Patrick Fitzsimmons on

Seven Startup Lessons from Intuit

Recently I was reading Inside Intuit, a book about the rise of the company that makes Quicken and Quickbooks.  Here are the take home points.  Not all of these lessons are universal, but hopefully you will find them useful.

1)    Business skills and technical skills are equally important.  
Techies and suits can remove their hands from each other's throats and work together.  Intuit founder Scott Cook had the stereotypical business pedigree:   Harvard MBA, Procter and Gamble Product Manager, and Bain Consultant.  Partner Tom Proulx was a 22 year old Stanford programming whiz.   Cook used his business skills to do systematic market research and create a clearly defined product plan.  While Intuit's techie-run competitors crammed every possible feature into their products, Cook designed Quicken to make it super easy for the home user to balance a checkbook.  Cook also formed distribution partnerships with banks and devised innovative advertising campaigns.

Equally important to Intuit's success were Proulx's technical achievements.  When the team realized that most people had trouble printing checks on their dot matrix printers, Proulx spent all night hacking up a way for the program to automatically alert the user if the printer paper was not installed right.  This ultimately earned Intuit a patent.    Proulx worked long hours and programmed the first two versions of Quicken almost entirely by himself.


2)   Learn from non-technology businesses. 
Founder Scott Cook previously worked as a brand manager for Procter and Gamble, a big consumer goods company.  He decided to imitate breakfast cereal marketing and package Quicken in a bright orange box so that it would stand apart on the shelves.  This was very different from the boring, business-like packaging of the time, and it helped Quicken sell.


3)     Your competition is not other companies, but the way that things are done now.
Forty-six companies sold accounting packages at the time Quicken entered the market.   But Scott Cook found that Intuit's real competitor was paper and pencil.  None of the other packages could balance a checkbook faster than could be done by hand.    That was the real challenge that Quicken faced - not competitors.

4)   Talk to as many potential customers as you can, from the very beginning. 
After Scott Cook thought of the idea for Quicken, he picked up a phone book and called a hundred people at random.  These conversations not only told Cook that he was onto something, but they also taught him to deeply understand people's check balancing habits and difficulties.  It was these calls that told him what the three most important features of the product should be.



5)   Focus your product on the absolute essential user needs. 
Intuit's competitors crammed their products with every imaginable feature.  The resulting monsters were so complex that their own engineers could not even print a check.  By calling and talking to hundreds of people, Cook learned that there were three tasks that people wanted to accomplish:  paying bills, maintaining a record of checks, and reviewing expenditures.  He focused everything on making these three tasks incredibly easy.  He routinely pulled pedestrians off the streets of Palo Alto into the office and would sit with a stop watch as they tried to use the software to print a check.  The average user took 7 minutes to find the enter key on the keyboard (remember, this was the early 1980s), but they still managed to print a check within 15 minutes.

6)  Even future billion dollar companies will teeter on the brink of defeat.  
"This isn't a layoff," Cook told his employees, "I'd like everyone to keep working.... we just can't pay you right now."  Intuit's bank account was down to a mere $385.  Cook told his employees that he would like to talk to each of them individually, but that they would have to wait because, at the moment, he had to meet with his marriage counselor.  His wife was distraught because their entire savings was disappearing into Cook's startup.

Fortunately for Intuit, the lead developer, Tom Proulx, agreed to stay on board working only for stock.  The company juggled which bills they would pay and which they would delay.  They used packing boxes as desks.   They endured and the tide began to turn.   The Apple II version was far more popular than the IBM version.  Several banks finally came through on the partnership deals.  A magazine gave the software a great review.  The cash came in, and Intuit was breathing again.


7)  Do right by your customers
Disaster struck.  Customers from across the country were calling in complaining that the latest version of Quicken was freezing when they were trying to save their data to a floppy disk.   Intuit decided that all 20,000 customers needed a new version, and needed it as soon as humanly possible.  In the days before the Internet, that meant creating 20,000 disks in three days.  That was 8,000 more than their supplier could provide.  So the entire staff got to work, jury-rigging computers into positions where they could copy disks using both their fingers and their toes.   After that came a massive envelope stuffing operation.  Within a week, they mailed out 20,000 new disks to all their customers.  Their customers loved the rapid response.  Their customer satisfaction was actually higher than before the bug occurred.



posted in programming by Patrick Fitzsimmons on

How to bring Internet freedom to China

We commonly condemn China's draconian controls on Internet usage. But amazingly, there is a way that we in the United States can give everyone in China Internet freedom. Currently, it is fairly easy to set up a proxy server that allows a Chinese web surfer to access the Internet as if he or she were living in America. The problem with proxy servers is that if the server becomes popular, the Chinese government will block the web address of the server and punish those trying to use it.

The solution is to host the proxy server on the domain of a critical business site. A huge number of Chinese businesses run off of Microsoft services. When I spent time in China, most of the people I met used Microsoft Hotmail for their email - work and personal. Block Hotmail and the government cripples tens of millions of people going about their daily business. Other business-critical web sites include American corporate sites, such as Microsoft Update, Google, PayPal, and Chase Manhattan Bank. The government cannot ban those services without suffering severe economic repercussions.

An act of Congress should require that all major US companies doing business with China host proxy servers on their domains. A person in China would simply go to https://www.hotmail.com/FreeInternet/ and be able to access the entire web without restriction. By using the secure "https" protocol, the "FreeInternet" part of the address would be encrypted. Thus the Chinese government would have absolutely no way of detecting whether the user was using the free internet or sending a business email. Chinese citizens would be completely free to write blogs, read foreign news, and engage in political discourse all with complete security and anonymity.

Even without an act of Congress, we in free countries can create these proxy services by convincing the right organizations to host the servers on their domains. Domains such as apache.org or mozilla.org would be very difficult for China to block without doing severe damage to their software industry. We could build a movement to host proxy servers on these types of domains.

I'd love to hear your feedback about this proposal. If you think it's a good idea, please spread it around. We can start a movement to bring Internet freedom to China.

posted in programming by Patrick Fitzsimmons on

Offline Advertising Still Rules

Marketing Sherpa has a fascinating report out on online versus offline advertising spending in 2006.  It turns out that old school advertising is still way ahead of this new internet thing.

Print newspapers and magazines have been taking a beating from the Internet and blogs, but they are still way ahead of all online spending in advertising revenue.  Spending on newspaper advertisements was $30 billion - 4 times the spending on paid search advertising.  Magazine ads were at $23 billion.  That's almost 6 times the spending on all online, non-search advertising.

One thing that might skew the numbers is that advertising costs so much less on the Internet.  Direct mail is the greatest expenditure at $59 billion.  But the online equivalent - email marketing - is virtually free. Thus even if there is nearly as much email marketing, it won't show up in the expenditure figures.  Another example of this asymmetry is Craigslist, which is eliminating the entire classifieds industry by making postings free.

While some might view this as a bucket of cold water poured on the Web 2.0 hype, I see this as a huge opportunity.   More and more content is going online, but advertising is not keeping up.  This is despite the fact that advertising online can be far more targeted and effective than offline.  Google, for example, can often get a CPM higher than $1,000 on its more competitive search terms.  There is still a huge opportunity for companies in the space of analytics, ad targeting, and ad distribution.

posted in programming by Patrick Fitzsimmons on

Joining HubSpot

I am pleased to announce that I'm joining HubSpot.  It's a Cambridge, MA startup building an Internet Marketing platform.   GroupSharp will be converted to their software platform, and will become the basis of the CRM system integrated with the platform.

About me
I am a startup guy and software engineer.  I work my craft at HubSpot, a startup in Cambridge, MA that builds inbound marketing software.  In HubSpot's early years I had to rapidly crank out prototypes as we figured out how to make a product that customers wanted.  Now I help design larger scale systems to serve our nearly ten thousand customers.
