Code with The Man

Saturday, May 22, 2010

GoRuCo Pace U. - Session 3

"Managing Ruby Teams" - Luke Melia

First things first: what makes rubyists different?
- lots of independent thinking
- all open source with no dominant conglomerate running frameworks, dev techniques, etc... (see C#, Java)
- rubyists constantly embrace change (for example, Rails has 1500 contributors and consistently breaks backward compatibility - and everyone seems cool with this).

The Key Principles:
- self-organizing team (core of XP practices); this extends to empowering the team to maximize its own effectiveness.
- personal growth: simply being successful on a team, for most developers, is not enough. So a manager's job is to make sure they grow as professionals while working on the team.
- hire well (Cyrus: check)
- servant leadership - Jess Hottenstein may actually know what he's talking about.
- inspirational leadership

Practices:
1) Sit with your team. Listen and observe, but think twice before interrupting. Let the team try to hash out their issue.
2) Management by coding around. Pair program with different people, walk around the team's area and observe information radiators. Observe the "key feedback loop."
3) Tools and workspace. Share editors (if possible), tools, scripts, etc... amongst the entire team. And, obviously, sit together in an open area.
4) A Policy for Policies. You will have to put some rules in place but try to put in as few as possible. Fit each policy to the frequency and severity of the target problem.
5) Retrospectives and Kaizen. Luke doesn't distinguish between the two, but I think both are separate and important activities.
6) Delegation. Identify discrete, repetitive tasks and automate them. When you can't automate, delegate. Delegated tasks should be shared amongst the team. (An example from my team is tracking and, if necessary, troubleshooting the twice-daily push to our staging server.)
7) The Andon Cord. The stop-the-line cord. (This is from Toyota Production System). Give your team this power. (And definitely stop the line if the build breaks!)
8) Be patient. Be aware of the forming, storming, norming, performing cycle.
9) One-on-ones. Simply a meeting between the project lead and the team member. Make the time pre-planned to privately address concerns and questions. You can also take this time to give feedback.
10) Motivation. Understand what motivates each employee. It can vary widely. For rubyists, don't underestimate the "mastery" motivator.
11) Feedback. Learn how to deliver it effectively. On our project we are actively trying to become better at giving each other feedback. It's not easy, but it really promotes a healthy team.
12) Open Source. Use it; share your changes. Using git submodules is a great way to contribute to open source while still addressing the team's concerns.
13) Hiring. Do it like Cyrus does it (Note: This is not exactly what Luke said). But one interesting thing that we don't do is "get 100% of the team on board." This can be tricky; we don't always know where the new hire is going or what team they'll be on. But that doesn't mean it's not a good goal to shoot for.
14) The Us-Them Relationship. Fight hard against the "our team versus the company/other project teams" relationship. Cultivate it externally - carefully. This means looking into the community to find other teams or projects that your team can learn from.
15) Be honest. Don't lie to your team. Share your passion. This is the 15th bullet point, but Luke makes it sound like the most important one. "Destroy trust, and the whole thing comes tumbling down."

Check out the reading list at: http://bit.ly/gorucobooks for some more info. Also Manager Tools, Agile Executive, and Agile Toolkit are all podcasts that helped Luke become a better manager.

GoRuCo Pace U. - Session 2

"memprof: the ruby level memory profiler" - Aman Gupta

Addressing the ruby GC is a way to get Ruby to run faster. (Aman is speaking of MRI Ruby, which is the most commonly used implementation.)

When the GC is running, it "stops the world." None of your code runs while this happens. The GC is conservative, which makes it difficult to tweak. It also uses a "mark and sweep" design, which is less than optimal.

To improve performance:
- avoid leaked references: you create an object and you don't need it anymore, but you're still holding onto the reference - the GC will not clean it up but always has to look at it.

- create fewer objects: obviously
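To make the "leaked reference" point concrete, here is a minimal sketch. The class names (LeakyReporter, TidyReporter) and the HISTORY constant are hypothetical and were not part of Aman's talk; they only illustrate an object that is no longer needed but is still reachable.

```ruby
# Hypothetical example: a class-level array that quietly keeps every
# result alive, so the GC has to mark these objects on every run.
class LeakyReporter
  HISTORY = []                     # lives as long as the process does

  def self.build(rows)
    report = rows.map { |r| r.to_s.upcase }
    HISTORY << report              # leaked reference: never read again
    report.size
  end
end

# Same work, but nothing holds on to the intermediate array afterwards,
# so it is eligible for collection once the method returns.
class TidyReporter
  def self.build(rows)
    rows.map { |r| r.to_s.upcase }.size
  end
end

3.times { LeakyReporter.build(%w[a b c]) }
3.times { TidyReporter.build(%w[a b c]) }

puts LeakyReporter::HISTORY.size   # the three reports are still reachable
```

Both classes do the same work; the only difference is whether a reference survives the call.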

The tools:

1. ObjectSpace.each_object: When it's sorted and printed out you can get a sense of what's going on in the VM. An example:

require 'pp'

# Count live objects by class
types = Hash.new(0)
ObjectSpace.each_object do |obj|
  types[obj.class] += 1
end

# Print the tallies, least common first
pp types.sort_by { |klass, num| num }

2. gdb.rb: gdb hooks for REE. Written by Aman. It's "a bunch of python code" that understands how your ruby code works. gdb can find and fix leaks and has a number of different processes.

3. bleak_house: installs a custom patched version of ruby and tells you what is leaking and WHERE the leak is happening. The drawback is that you have to reinstall ruby and all of your gems. Aman also wrote a patch that includes more verbose information about the leak and where it's happening. It is a "heap dumping" patch.

4. memprof: built for ease of use, detailed information, and simple analysis. It emits data in a simple JSON format, so you can process it with various languages and databases. You can follow all the under-the-hood specifics on www.timetobleed.com.

Memprof.track: like bleak_house but only for a given block of code (not the entire code base). So you can drill down into specific areas of your code base.

** You'll want to try these tools out in production mode. Development mode has more overhead and your results will be skewed **

memprof.com: a web-based visualizer and leak analyzer. Aman used it to find a leak in the development mode of Rails 3 beta.

Friday, May 21, 2010

GoRuCo Pace U. - Session 1

"Grease Your Suite: Tips and Tricks for Faster Testing" - Nick Gauthier

Nick uses Rails, Shoulda, Factory Girl, and Paperclip. He favors Factory Girl over fixtures, as do I. Shoulda is interesting because it can be integrated with Test::Unit but still give it a BDD feel. I've used it a bunch of times but the jury is still out for me.

His "vanilla" test suite, the one with standard unit and functional tests (and probably others), takes 13 minutes, 15 seconds.

Implementing Shoulda with nested contexts took the test time down to about 5 minutes.

Using Paperclip saves time during functional testing. By using Paperclip to return negligibly sized images during

parallel_tests, specjour, deep-test and tickle are tools to fork your testing process across multiple cores. According to Nick, these are good tools but they don't do the "balancing" across the cores very well. parallel_tests is for Test::Unit. It requires more setup in that you have to have multiple test databases running.

But Nick recommends hydra. It will run with Test::Unit (still tops in my mind) and doesn't require multiple db setup. On top of that, it has "smarter" balancing across multiple cores. It will also run with cucumber.

Some background on environment loading: Test::Unit will actually load the env 4 times, Cucumber does it twice, and RSpec does it just once. Hydra also boots the env only once, even for multiple test frameworks.

Hydra brought the testing time back to 1 minute, 26 seconds.

The default behavior for filesystem mounting on most 'nix systems is journaling enabled. This means that all data is committed into the journal prior to being written into the main filesystem. Another option is journal_data_writeback. This means that data may be written into the main filesystem after its metadata has been committed to the journal. This will increase speed, but at the cost of total safety of your metadata. You don't care about this in your tests, so try it. But make sure you don't take this to production!

atime - totally useless (according to Nick) so turn it off in your filesystem.

After this the total test time is only 50 seconds.

Next, Nick switched to using RVM to install Ruby EE. He then implemented tcmalloc. He bumped up RUBY_HEAP_MIN_SLOTS as a setting on ruby EE from the default (10,000) to 1,000,000. This says "take a huge chunk of memory" to run our tests with. He bumped RUBY_GC_MALLOC_LIMIT up to 1,000,000,000. RUBY_HEAP_FREE_MIN went from 4,096 to 500,000. You can simply set this in your profile for testing.
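As a rough sketch of "set this in your profile," the snippet below collects the three values from the talk and prints them as shell export lines. The constant and method names (REE_GC_SETTINGS, gc_exports) are my own, and I'm assuming these variables are read by Ruby EE as described above; other interpreters may ignore them.

```ruby
# Values as reported in the talk; the constant and helper names are
# hypothetical and exist only for this illustration.
REE_GC_SETTINGS = {
  "RUBY_HEAP_MIN_SLOTS"  => 1_000_000,
  "RUBY_GC_MALLOC_LIMIT" => 1_000_000_000,
  "RUBY_HEAP_FREE_MIN"   => 500_000
}.freeze

# Render each setting as an `export` line suitable for a shell profile.
def gc_exports(settings = REE_GC_SETTINGS)
  settings.map { |name, value| "export #{name}=#{value}" }
end

puts gc_exports
```

Paste the printed lines into the profile of whatever account runs your tests, and keep them out of production profiles unless you have measured the effect there too.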

After this, the total test time is 18 seconds. Quite a jump! And it looks relatively simple to implement. Right now, it only takes our project about 2 minutes to run our test suites. But once we get into Cucumber, time is going to increase rapidly.

Friday, April 9, 2010

Philly ETE, Day 2: Session 5

Chef: Saving Time (and Money) with Automated Provisioning

-- Trotter Cashion

Trotter starts by talking about his own user experience with automating deployment. His company looked at Chef, decided it would be too complicated, and decided to create the automation themselves using Bash. It was a big win at first, but over time it became way too difficult to maintain, adding new machines was time consuming, and there were still too many manual tasks involved.

So eventually they moved to Chef.

Chef is...

written by Opscode

written in Ruby

in some sort of controversy with Puppet

driven by a Ruby DSL

Chef has a concept of 'cookbooks.' A cookbook is a set of instructions for installing a piece of software; many are published by Opscode. Chef can also keep machine and code in lock-step. So you can deploy your software to any environment ("QA," "performance," "development") even though configuration may well vary among them. Chef helps you keep track of all the necessary elements by assigning each environment a particular set of cookbooks.

One main difference between Capistrano and Chef is that Chef eschews the concept of SSH. Chef assumes that it is already on the box on which it is installing software. It also assumes that it has root access.

According to Trotter, Chef provisioning takes between 15 and 30 minutes and deploys take under 2 minutes.

Using Chef with Spatula:

git clone http://github.com/opscode/chef-repo

gem install spatula (this is Trotter's tool)

spatula prepare db-server.yourcompany.com

spatula install my_database #this will look up the appropriate cookbook and install

...Rest of instructions are on the slides and I don't want to copy them word for word. They should soon be up on the Philly ETE Site.

The Chef Directory is really simple. It contains 'config,' 'cookbooks', 'roles', and 'custom-cookbooks' subdirectories. The cookbook directory contains recipes, files and templates (static or dynamic files to copy to places on your machine), attributes, and some others.

Trotter says the best possible place to start learning Chef is on their website, though not necessarily at the home page. Start with http://wiki.opscode.com/display/chef/Resources. This has the most relevant definitions and lots of example code to get you started.

Philly ETE, Day 2: Session 4

Barbara, Demeter, and Don: notes on some CS precepts from a non-scientist programmer

-- David A. Black

This talk will focus on three ideas: Liskov Substitution Principle (Barbara), the Knuth "premature optimization" principle (Don), and the Law of Demeter.

Liskov: "Let q(x) be a property provable about objects x of type T. Then q(y) should be true for objects y of type S where S is a subtype of T."

How about some pseudocode?

type Bicycle {

attribute "wheels" = 2

}

type Tricycle inherits from Bicycle {

attribute "wheels" = 3

}

That's a simple example of a violation of the LSP because the Tricycle changes the core attribute of its parent. The Ruby way of looking at LSP is to eschew a heavy reliance on inheritance hierarchy.
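Here is the same idea as a runnable Ruby sketch. The Bicycle and Tricycle classes mirror the pseudocode; the spare_tires_needed method is a hypothetical caller I added to show how code written against the parent's "two wheels" property breaks when handed the subtype.

```ruby
class Bicycle
  def wheels
    2
  end
end

class Tricycle < Bicycle
  # Overrides a property that callers of Bicycle rely on.
  def wheels
    3
  end
end

# Hypothetical caller written against Bicycle: it assumes the provable
# property "a bicycle has two wheels."
def spare_tires_needed(bicycle)
  raise ArgumentError, "expected two wheels" unless bicycle.wheels == 2
  2
end

puts spare_tires_needed(Bicycle.new)

begin
  spare_tires_needed(Tricycle.new)
rescue ArgumentError => e
  puts "Tricycle broke the caller: #{e.message}"
end
```

The subtype is a perfectly valid Ruby class; the problem only shows up when it is substituted where a Bicycle is expected.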

type != class in Ruby. So what is an object's type? "For any Ruby object obj, the type of obj is: the type that objects of the type that obj is of are of." Glad we cleared that up. David borrows the term "stereotyping" and repurposes it to mean trying to determine an object's ancestry in order to decide whether it is suited to a certain task. This is the wrong approach. Duck typing is more effective, clean, and in keeping with 'the Ruby Way.'

In Ruby, David points out, we only care about objects and what they can do 'in the moment.' We want to send them a message and have them respond properly at a particular point in time. So inspecting the object's ancestry is unnecessary. Duck typing will tell you whether or not the object is suited to task.
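A small sketch of the difference, as I understand it. The Duck and Robot classes and both greeting methods are hypothetical names of my own, not examples from David's slides.

```ruby
class Duck
  def speak
    "Quack"
  end
end

class Robot
  def speak
    "Beep"
  end
end

# "Stereotyping": decide based on the object's ancestry.
def stereotyped_greeting(obj)
  obj.is_a?(Duck) ? obj.speak : "(silence)"
end

# Duck typing: decide based on whether the object can handle the
# message right now.
def duck_typed_greeting(obj)
  obj.respond_to?(:speak) ? obj.speak : "(silence)"
end

puts stereotyped_greeting(Robot.new)
puts duck_typed_greeting(Robot.new)
```

The Robot can speak just fine, but the ancestry check never gives it the chance.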

So is Ruby compliant with LSP? We talk about the issues of type and object substitution frequently as Ruby programmers and we do it in a way that may or may not be orthogonal to LSP. I will be exploring this further as I continue through my Uncle Bob Payroll Case Study exercise.

Knuth:

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." -- The full 'premature optimization' quote from Knuth and/or Sir Tony Hoare.

David states it straight out: this quote gets misused frequently. His first statement in an attempt to ground us is "not all optimization is premature." He also uses an example of the 'match' method versus the '=~' method to illustrate. Why should we feel hesitant or anxious over choosing one or the other based on performance? Doesn't this fall into the 97% of small efficiencies? Probably. Other problems exist with the wording of the statement itself. 'Optimal' is akin to a superlative. Maybe we should try to incrementally 'ameliorate' our code instead.
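If you want to see the 'match' versus '=~' question for yourself, a sketch like the following will do. The LINE and PATTERN constants and the iteration count are arbitrary choices of mine, and the timings will vary by machine and Ruby version, so treat the output as a curiosity rather than a verdict.

```ruby
require "benchmark"

LINE    = "GoRuCo 2010"
PATTERN = /\d+/

# Both calls answer "does the pattern match?" but return different things:
# =~ gives the index of the match (or nil), match gives MatchData (or nil).
puts LINE =~ PATTERN
puts LINE.match(PATTERN)[0]

# Time the same number of calls to each.
Benchmark.bm(6) do |bm|
  bm.report("=~")    { 100_000.times { LINE =~ PATTERN } }
  bm.report("match") { 100_000.times { LINE.match(PATTERN) } }
end
```

Whatever the numbers say, a difference this small is unlikely to matter unless the call sits in a hot loop, which is exactly the point of the 97%.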

Demeter:

"The goal of the Law of Demeter is to organize and reduce dependencies between classes. Informally, one class depends on another class when it calls a function defined in the other class."

The example used is a Ruby implementation of an archiving program. Then, interestingly enough, he shows us something that usually is (wrongly) seen as a violation of the Law of Demeter:

a = Adder.new

a.add(3).multiply(6).add(1).subtract(5)

puts a.sum

Since this object chains methods that all call itself, it is not by definition a violation of the Law of Demeter. A real violation involves one object reaching into another object in a way that unnecessarily couples the two objects. David then showed an example of a clear Demeter violation that did not do any "dot" method chaining.
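David's slide did not show the Adder class itself, so the following is my own guess at a minimal implementation. I'm assuming the sum starts at zero and that each method returns self, which is what makes the chain a series of messages to one object rather than a walk through other objects' internals.

```ruby
class Adder
  attr_reader :sum

  def initialize
    @sum = 0          # assumption: the running total starts at zero
  end

  # Each operation updates the total and returns self so that calls
  # can be chained on the same object.
  def add(n)
    @sum += n
    self
  end

  def subtract(n)
    @sum -= n
    self
  end

  def multiply(n)
    @sum *= n
    self
  end
end

a = Adder.new
a.add(3).multiply(6).add(1).subtract(5)
puts a.sum   # → 14
```

Four dots, one object: every call in the chain is sent to the same Adder.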

http://haacked.com/archive/2009/07/14/law-of-demeter-dot-counting.aspx is a post recommended by David to further explain how Demeter violations are not just identifiable by counting dots.

Philly ETE, Day 2: Session 3

Demystifying HTML5: Understanding the Emerging Web

-- Molly Holzschlag

The history: WHATWG is a group that broke away from the W3C because its members did not believe in the future of XHTML. Now, of course, XHTML is dying off and is no longer supported. However, what comes out of that group is not considered an "open standard." The only official open web standards are issued by the W3C.

Today, W3C says they are committing future resources to HTML5 and are no longer supporting XHTML.

Google, Microsoft (!!!), Mozilla, Opera, and WebKit all, for the first time ever, believe in HTML5 and want to see it succeed. This may be the first time in the web's history that such ubiquitous acceptance has happened.

Lots of companies are already jumping on board -- Netflix, MySpace, YouTube.

HTML5 is an attempt to advance the web's core language and to embrace a rich, interactive, forward-thinking web. It is also being designed to support full backward-compatibility. HTML5 will replace XHTML 1.0, DOM2HTML and, of course, all previous versions of HTML.

One of the core design principles for HTML5 is explicit specification for error handling and more graceful degradation (for example, if you deploy to an older browser that does not support HTML5, the application should fail gracefully).

The W3C's objective is to evolve HTML rather than recreate it. They are trying to avoid reinvention. They are also trying to build on real-world use and test cases.

HTML5 Syntax:

There is no document type definition (DTD). So there is no special declaration for version 5.

HTML syntax is still served using text/html. However, you can use HTML5 with XML syntax and XHTML 1.0.

When we implement in XML notation we must close our tags the proper way (of course, we should always be rigorous and write well-formed HTML as well as XML).

The W3C will soon support SVG universally (they're waiting on -- shocker -- IE to build support into IE9).

There are all new elements that Molly runs through with us. You should read her slides to get the full list of new elements and definitions. She said she will soon post them on the Philly ETE site.

'Input' is a particularly interesting new element. It allows us to declare and execute embedded scripts in a more terse, expressive manner without all the clunkiness of the way we declare them now.

'Required' is another great new addition (it is an attribute on form inputs rather than an element). It allows us to validate form input without using any scripts! That's pretty darn cool.

Embedded media is also coming. Such as canvas (the HTML5 drawing API), video, and audio (embedded audio and video without the reliance on plug-ins). These are works in progress. Even further down the road (but still realistic future features) are localStorage (client-side data storage that is persistent across sessions and uses client-side SQL storage) and applicationStorage (enabling offline applications).

One of the best places, believe it or not, to find information about web standards and HTML5 is Wikipedia. So feel free to check it out there or, of course, from the horse's mouth at www.w3.org.

Philly ETE, Day 2: Session 2

Hardwired for deception means trouble with estimates!

-- Linda Rising

My first experience of Linda Rising was through her fabulous book, "Fearless Change." That is the most inspiring book I've ever read about organizational change and is packed with strategies and patterns for doing so. Ariel and I continue to use the language developed in the book to talk about our own work in effecting change at our client.

She's here today to give us "some bad news" and tell us "some bad things about" us. Much of the information she's going to present will be too difficult for us to absorb and the natural reaction will be to "explain it away."

I'm strangely intrigued.

Linda's definition of deception: "consciously or unconsciously leading another or yourself to believe something that is not true." We'll start here. Her claim is that we naturally deceive ourselves or others - constantly. So it's a natural extension of this to conclude that we are constantly screwing up our estimates.

We are hardwired to deceive. We are hardwired to be optimistic, to see what we want to see. Then we rationalize away any incongruent results between our optimistic choice and reality. We are all biased, prejudiced, etc... There is no way around this. According to Linda, we cannot do anything about it.

Along with the confidence in your own intelligence comes the illusion of rationality. But you are not rational and you are certainly not more or less rational than anyone more or less intelligent than yourself. In fact, the smarter you are, the more you deceive yourself and the more clever your rationalizations become.

One example goes back to a scientific discovery made in 1847 that doctors who washed their hands between performing autopsies and delivering babies saved more lives of women and infants. But this discovery did not affect many doctors' decisions to wash their hands. They continued their non-hand-washing practices even in the face of scientific evidence.

"A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it." -- Max Planck. Wow, that is grim.

Back to the world of everyday work. How about hiring? In an experiment, a number of interviewers were given the same information for a job candidate. Some interviewers were given some 'positive' information about the candidate first, and the others were given somewhat 'negative' information first. Those interviewers who received positive information first about a candidate gave significantly higher predictions of success. In fact, this tendency was exaggerated when the element of time pressure was added.

More stats on deception: On average, during a typical conversation there are 3 lies every 10 minutes. A survey of college professors revealed that 93% believed they were above average. And so on.

OK, this presentation is really about the real world. The practice of deception is passed onto our children. We teach them to deceive in a socially acceptable manner (i.e., pretend to like every present you get from family members on Christmas). We eat more from larger containers, when served larger portions, or when at an all-you-can-eat buffet. Yet we deceive ourselves by underestimating how much we eat. Sedentary people (such as programmers) show an increase in the hormone ghrelin, which increases appetite. So internally, even your own body deceives you into thinking you need to eat more than you actually do.

Getting back to our software story estimates, we can now draw conclusions about our (in)ability to estimate. Linda used to believe that mathematical models and tons of data from past projects would point the way to better estimates. But she saw that she was wrong. In her experience she was never able to make better estimates based on any of her tools, models, or truckloads of data.

So should we even bother? Should we build in some mechanism for tuning our estimates to a less optimistic number? Linda has one suggestion: stop using numbers. Estimates represented by numbers are inherently deceptive. They give the illusion of being something that can be added, subtracted and multiplied. But estimates do not have these properties by nature. So getting away from numbers will alleviate some of the tendency to deceive oneself and others into confusing an estimate with a calculation. "Use t-shirt sizes or gummy bear colors," she says. Anything but numbers.

Linda's goal is to get us thinking about this and for me she has succeeded. And there has to be more than just estimating that is getting colored by our constant deception of ourselves and others - interviewing practices (as noted above), team dynamics, and decisions made during pairing are a few that spring to mind.

Crap! I just realized Cyrus is giving out estimating decks at our booth. They have numbers all over them! I must stop them! I'm off!