Friday, April 9, 2010

Philly ETE, Day 2: Session 5

Chef: Saving Time (and Money) with Automated Provisioning
-- Trotter Cashion

Trotter starts by talking about his own experience automating deployment. His company looked at Chef, decided it would be too complicated, and opted to build the automation themselves in Bash. It was a big win at first, but over time it became far too difficult to maintain, adding new machines was time consuming, and there were still too many manual tasks involved.

So eventually they moved to Chef.

Chef is...
written by Opscode
written in Ruby
in some sort of controversy with Puppet
driven by a Ruby DSL

Chef has a concept of 'cookbooks.' A cookbook is a set of instructions for installing and configuring a piece of software, and Opscode publishes many of them. Chef can also keep machine and code in lock-step: you can deploy your software to any environment ("QA," "performance," "development"), and configuration may well vary among them. Chef helps you keep all of the necessary pieces straight by associating each environment with a particular set of cookbooks.
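To make the cookbook idea concrete, here is a minimal sketch of what a recipe's Ruby DSL looks like. This is my own illustration, not from Trotter's slides; the memcached package, template, and memory value are just placeholders.

# Illustrative recipe: install a package, start its service, render its config.
package "memcached"

service "memcached" do
  action [:enable, :start]
end

# Render a config file from an ERB template shipped inside the cookbook.
template "/etc/memcached.conf" do
  source "memcached.conf.erb"
  variables(:memory => 64)
end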

One main difference between Capistrano and Chef is that Chef eschews the concept of SSH. Chef assumes that it is already on the box on which it is installing software. It also assumes that it has root access.

According to Trotter, Chef provisioning takes between 15 and 30 minutes and deploys take under 2 minutes.

Using Chef with Spatula:
git clone http://github.com/opscode/chef-repo
gem install spatula (this is Trotter's tool)
spatula prepare db-server.yourcompany.com
spatula install my_database #this will look up the appropriate cookbook and install

...Rest of instructions are on the slides and I don't want to copy them word for word. They should soon be up on the Philly ETE Site.

The Chef repository layout is really simple. It contains 'config,' 'cookbooks,' 'roles,' and 'custom-cookbooks' subdirectories. Each cookbook contains recipes; files and templates (static or dynamically generated files to copy into place on your machine); attributes; and a few other things.
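Roughly, the layout looks like this (the my_database cookbook name is just a placeholder):

chef-repo/
  config/
  roles/
  custom-cookbooks/
  cookbooks/
    my_database/
      recipes/
      templates/
      files/
      attributes/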

Trotter says the best possible place to start learning Chef is on their website, though not necessarily at the home page. Start with http://wiki.opscode.com/display/chef/Resources. This has the most relevant definitions and lots of example code to get you started.

Philly ETE, Day 2: Session 4

Barbara, Demeter, and Don: notes on some CS precepts from a non-scientist programmer
-- David A. Black

This talk will focus on three ideas: Liskov Substitution Principle (Barbara), the Knuth "premature optimization" principle (Don), and the Law of Demeter.

Liskov: "Let q(x) be a property provable about objects x of type T. Then q(y) should be true for objects y of type S where S is a subtype of T."

How about some pseudocode? (Here it is sketched as runnable Ruby.)

class Bicycle
  def wheels
    2
  end
end

class Tricycle < Bicycle
  def wheels
    3
  end
end

That's a simple example of a violation of the LSP because the Tricycle changes the core attribute of its parent. The Ruby way of looking at LSP is to eschew a heavy reliance on inheritance hierarchy.

type != class in Ruby. So what is an object's type? "For any Ruby object obj, the type of obj is: the type that objects of the type that obj is of are of." Glad we cleared that up. David borrows the term "stereotyping" and repurposes it to mean inspecting an object's ancestry in order to decide whether it is suited to a certain task. This is the wrong approach. Duck typing is more effective, cleaner, and more in keeping with 'the Ruby Way.'

In Ruby, David points out, we only care about objects and what they can do 'in the moment.' We want to send them a message and have them respond properly at a particular point in time. So inspecting the object's ancestry is unnecessary. Duck typing will tell you whether or not the object is suited to task.
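A quick illustrative sketch of stereotyping versus duck typing (mine, not David's; the store helper is hypothetical):

# Stereotyping: interrogate the object's ancestry before trusting it.
def archive(source)
  raise ArgumentError unless source.is_a?(File)
  source.each_line { |line| store(line) }   # store is a hypothetical helper
end

# Duck typing: only care what the object can do right now.
def archive(source)
  source.each_line { |line| store(line) }   # works for File, StringIO, a socket...
end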

So is Ruby compliant with LSP? We talk about the issues of type and object substitution frequently as Ruby programmers and we do it in a way that may or may not be orthogonal to LSP. I will be exploring this further as I continue through my Uncle Bob Payroll Case Study exercise.

Knuth:
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." -- The full 'premature optimization' quote from Knuth and/or Sir Tony Hoare.

David states it straight out: this quote gets misused frequently. His first attempt to ground us: "not all optimization is premature." He uses the 'match' method versus the '=~' operator as an example. Why should we feel hesitant or anxious about choosing one or the other based on performance? Doesn't this fall into the 97% of small efficiencies? Probably. There are also problems with the wording of the maxim itself: 'optimal' is akin to a superlative. Maybe we should try to incrementally 'ameliorate' our code instead.
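For context (my example, not David's), the two forms do the same matching but return different things; whichever reads better is probably worth more than any micro-benchmark difference:

line = "Philly ETE 2010"

line =~ /\d+/         # => 11  (index of the match, or nil)
line.match(/\d+/)     # => #<MatchData "2010">  (or nil)
line.match(/\d+/)[0]  # => "2010"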

Demeter:
"The goal of the Law of Demeter is to organize and reduce dependencies between classes. Informally, one class depends on another class when it calls a function defined in the other class."

The example used is a Ruby implementation of an archiving program. Then, interestingly enough, he shows us something that usually is (wrongly) seen as a violation of the Law of Demeter:

a = Adder.new
a.add(3).multiply(6).add(1).subtract(5)
puts a.sum

Since every method in the chain is called on the same object, this is not, by definition, a violation of the Law of Demeter. A real violation involves one object reaching into another object in a way that unnecessarily couples the two. David then showed an example of a clear Demeter violation that did not do any "dot" method chaining.
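A sketch of that distinction (my example, using the classic customer/wallet illustration, not David's code):

# A Demeter violation with no chained dots in sight:
wallet = customer.wallet
wallet.remove_cash(amount)

# Less coupled: ask the customer, don't reach inside it.
customer.pay(amount)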

http://haacked.com/archive/2009/07/14/law-of-demeter-dot-counting.aspx is a post recommended by David to further explain how Demeter violations are not just identifiable by counting dots.

Philly ETE, Day 2: Session 3

Demystifying HTML5: Understanding the Emerging Web
-- Molly Holzschlag

The history: the WHAT-WG is a working group that broke away from the W3C because they did not believe in the future of XHTML. Now, of course, XHTML is dying off and is no longer being advanced. However, what comes out of that group is not considered an "open standard." The only official open web standards are issued by the W3C.

Today, W3C says they are committing future resources to HTML5 and are no longer supporting XHTML.

Google, Microsoft (!!!), Mozilla, Opera, and the WebKit project all, for the first time ever, believe in HTML5 and want to see it succeed. This may be the first time in the web's history that such ubiquitous acceptance has happened.

Lots of companies are already jumping on board -- Netflix, MySpace, YouTube.

HTML5 is an attempt to advance the web's core language and to embrace a rich, interactive, forward-thinking web. It is also being designed to support full backward compatibility. HTML5 will replace XHTML 1.0, DOM2 HTML and, of course, all previous versions of HTML.

One of the core design principles for HTML5 is explicit specification for error handling and more graceful degradation (for example, if you deploy to an older browser that does not support HTML5, the application should fail gracefully).

The W3C's objective is to evolve HTML rather than recreate it. They are trying to avoid reinvention. They are also trying to build on real-world use and test cases.

HTML5 Syntax:
There is no document type definition (DTD), so there is no special declaration for version 5; the doctype is simply "<!DOCTYPE html>".

The HTML syntax is still served using text/html. However, you can also use HTML5 with XML syntax, as you could with XHTML 1.0.

Of course, when we implement in XML notation we must close our tags the proper way (of course, we should always be rigorous and write well-formed HTML as well as XML).

Browsers will soon support SVG universally (everyone is waiting on -- shocker -- IE to build support into IE9).

There are all new elements that Molly runs through with us. You should read her slides to get the full list of new elements and definitions. She said she will soon post them on the Philly ETE site.

'Input' is particularly interesting. The new input types let us declare, in a more terse and expressive manner, behavior that today requires clunky embedded scripts.

'Required' is another great new attribute. It allows us to validate form input without using any scripts! That's pretty darn cool.

Embedded media is also coming: canvas (the HTML5 drawing API), plus video and audio elements (embedded audio and video without reliance on plug-ins). These are works in progress. Even further down the road (but still realistic future features) are localStorage (client-side data storage that persists across sessions), client-side SQL storage, and the application cache (enabling offline applications).

One of the best places, believe it or not, to find information about web standards and HTML5 is Wikipedia. So feel free to check it out there or, of course, from the horse's mouth at www.w3.org.

Philly ETE, Day 2: Session 2

Hardwired for deception means trouble with estimates!
-- Linda Rising

My first experience of Linda Rising was through her fabulous book, "Fearless Change." That is the most inspiring book I've ever read about organizational change, and it is packed with strategies and patterns for making change happen. Ariel and I continue to use the language developed in the book to talk about our own work in effecting change at our client.

She's here today to give us "some bad news" and tell us "some bad things about" us. Much of the information she's going to present will be too difficult for us to absorb and the natural reaction will be to "explain it away."

I'm strangely intrigued.

Linda's definition of deception: "consciously or unconsciously leading another or yourself to believe something that is not true." We'll start here. Her claim is that we naturally deceive ourselves or others - constantly. So it's a natural extension of this to conclude that we are constantly screwing up our estimates.

We are hardwired to deceive. We are hardwired to be optimistic, to see what we want to see. Then we rationalize away any incongruent results between our optimistic choice and reality. We are all biased, prejudiced, etc... There is no way around this. According to Linda, we cannot do anything about it.

Along with confidence in your own intelligence comes the illusion of rationality. But you are not rational, and being more or less intelligent than someone else makes you neither more nor less rational than they are. In fact, the smarter you are, the more you deceive yourself and the more clever your rationalizations become.

One example goes back to a scientific discovery made in 1847: doctors who washed their hands between performing autopsies and delivering babies saved the lives of more women and infants. But this discovery did not affect many doctors' decisions to wash their hands. They continued their non-hand-washing practices even in the face of scientific evidence.

"A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it." -- Max Planck. Wow, that is grim.

Back to the world of everyday work. How about hiring? In an experiment, a number of interviewers were given the same information about a job candidate. Some interviewers were given the 'positive' information about the candidate first, and the others were given the somewhat 'negative' information first. Those who received the positive information first gave significantly higher predictions of success. In fact, this tendency was exaggerated when the element of time pressure was added.

More stats on deception: On average, during a typical conversation there are 3 lies every 10 minutes. A survey of college professors revealed that 93% believed they were above average. And so on.

OK, this presentation is really about the real world. The practice of deception is passed on to our children. We teach them to deceive in a socially acceptable manner (e.g., pretend to like every present you get from family members on Christmas). We eat more from larger containers, when served larger portions, or when at an all-you-can-eat buffet, yet we deceive ourselves by underestimating how much we eat. Sedentary people (such as programmers) show an increase in the hormone ghrelin, which increases appetite. So even your own body deceives you into thinking you need to eat more than you actually do.

Getting back to our software story estimates, we can now draw conclusions about our (in)ability to estimate. Linda used to believe that mathematical models and tons of data from past projects would point the way to better estimates. But she saw that she was wrong. In her experience she was never able to make better estimates based on any of her tools, models, or truckloads of data.

So should we even bother? Should we build in some mechanism for tuning our estimates to a less optimistic number? Linda has one suggestion: stop using numbers. Estimates represented by numbers are inherently deceptive. They give the illusion of being something that can be added, subtracted and multiplied. But estimates do not have these properties by nature. So getting away from numbers will alleviate some of the tendency to deceive oneself and others into confusing an estimate with a calculation. "Use t-shirt sizes or gummy bear colors," she says. Anything but numbers.

Linda's goal is to get us thinking about this and for me she has succeeded. And there has to be more than just estimating that is getting colored by our constant deception of ourselves and others - interviewing practices (as noted above), team dynamics, and decisions made during pairing are a few that spring to mind.

Crap! I just realized Cyrus is giving out estimating decks at our booth. They have numbers all over them! I must stop them! I'm off!

Philly ETE, Day 2: Session 1

I Think I Finally Understand Mocks
-- Brian Marick

When it comes to testing, there are a few people whose advice I seek out more often than others. Brian Marick is one of them (Martin Fowler and Steve Freeman are two others I can think of off the top of my head). So I'm excited to see what Brian has to say about Mocks.

"Object-oriented design is teaching objects what to say to each other." Nicely put.

Brian has a long and acrimonious history with mocks. He claims he finally got it... about six months ago.

"Mocks are about faster design and smoother pacing" relative to normal, everyday TDD.

Brian picks up his first example at the point that an AJAX request hits the server in a Sinatra application. The request will hit a JSON controller, which will go through several routines and then spit out some well-formed JSON over the wire. He uses Shoulda to test the JSON controller. He mocks out each object that the controller must interact with. Brian uses FlexMock (written by Jim Weirich) to mock out his objects. I wonder how this differs from Mocha. Should investigate this a little further.
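I won't reproduce Brian's code, but the general shape of a Shoulda test that mocks out a collaborator with FlexMock looks something like this (JsonEndpoint, data_for, and render are made-up names; only the FlexMock and Shoulda idioms themselves are real):

require 'test/unit'
require 'shoulda'
require 'flexmock/test_unit'   # wires flexmock() into Test::Unit

class JsonEndpointTest < Test::Unit::TestCase
  context "the JSON endpoint" do
    should "ask its collaborator for data and render it as JSON" do
      # Mock the collaborator instead of building the real object first.
      internalizer = flexmock("internalizer")
      internalizer.should_receive(:data_for).with("snippets").once.and_return("count" => 3)

      endpoint = JsonEndpoint.new(internalizer)   # hypothetical class under test
      assert_equal '{"count":3}', endpoint.render("snippets")
    end
  end
end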

After showing us the tests, he points out he can write code to pass them without worrying about creating or test-driving the objects that the controller utilizes. The idea here is that, contrary to many people's usual one-way drive through TDD, mocking out the controller's collaborators first gives us the choice of what to create next.

As an aside, he's using something called Prezi Desktop, which I've never seen in action before. It seems like a cool, digital way of whiteboarding a design.

For example, Brian next picks out the internalizer object. He test-drives that object, and then iteratively makes the controller actually "work" in the real world. The big win here is that as the internal functionality of the objects changes, your controller tests won't have to. The mock tests simply ensure that the contract between the controller and the objects it interacts with stays unbroken. Checking out his slides (which will eventually be posted here) will do better justice to this part of the talk than I can in prose. But I'll keep trying.

One interesting discovery Brian made was that when he finally felt comfortable with the way he was using mocks, he went back and replaced some of his tests with mock tests and was amazed at how much setup code and other cruft just "went away." I could definitely get down with that.

According to Brian: "Mocks are not about the final structure of the application. They are about the process by which you arrive at the final structure of the application." That seemed worth writing verbatim.

Faster Design: As the functionality of one of your controller's dependent objects grows, its file grows too big. Probably the right thing to do here is to fork off another object, and to keep doing this as needed (no shocker here: objects become too big or do too many things and must then be split into other objects). What Brian claims is that mocking "at the top" makes it easier to design loosely coupled objects test-first.

Better Pacing: Again, this comes down to choice. Mock all of the objects that a controller must interact with on the controller test, and then defer the decision of which objects to construct, decompose, or change in any other way to the last responsible moment. Very cool idea. And very well illustrated in one particular slide. I need to try some of this out soon.

Mocks encourage you to start at the very top. Start at the view, or the controller, and drill down to the objects, the persistence layer, etc...

Brian points out the dirty little secret of mocks: in all likelihood it will force you to rewrite tests more than you are used to. But, he says, the rewriting is easier and even starts to feel like a natural part of the TDD flow. A bold statement, and one I'm willing to put to the test in my own work.

Thursday, April 8, 2010

Philly ETE: Session 5

Clojure's Approach to State and Identity
-- Rich Hickey

Rich starts by mentioning that this is not a Clojure-specific talk until the very end. What we'll have is, in essence, a philosophical talk about programming practices, followed by a practical application of those ideas in Clojure.

"Pure" functions: depend only on their arguments, given the same arguments, always returns the same value, has no effect on the rest of the world, has no notion of time. Rich wants to reduce the definition of a function to what is essentially a mathematical one.

To the extent that you write functions as he just described, you will write better programs: easier to test, safer, cleaner. But by and large, programs are not pure functions, nor do they consist mostly of them. For the purposes of this talk, Rich defines anything that is not a pure function as a process.
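A tiny illustration of the distinction, sketched in Ruby (the language used elsewhere in these notes) rather than Clojure, and not taken from Rich's slides:

# Pure: depends only on its argument; the same input always gives the same output.
def double(n)
  n * 2
end

# Not pure (a "process" in Rich's terms): its answer depends on when you call it.
def seconds_since_midnight
  now = Time.now
  now.hour * 3600 + now.min * 60 + now.sec
end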

Processes may include the notion of time. They may have effects on the outside world, and they might produce different answers at different times even given the same arguments (a search engine, for example).

"Variables in traditional languages are predicated on a single thread of control, one timeline." Of course, as soon as you add concurrency you introduce problems. Your variables are no longer atomic, they may require locks, they may require workarounds to deal with the problem of time.

It behooves each language to have a model for time. Something must be built into the language to check whether one event happens before, after, or at the same time as some other event. In fact, relativity is what we are really after with respect to time.

I'm wavering here. Last session of the day, and I'm getting a little bombarded by the philosophical discussion. I needed to hear this earlier in the day. Good thing I'm going to the Clojure Pragmatic Studio to learn more!

Example: a race-walker foul detector. If the left and right feet are off the ground at the same time, that is a foul. This is another demonstration of how time must be taken into account. We need a snapshot of the world at a particular time in order to make this decision. We also can't solve this problem with locking (in this real-world example, that would mean stopping the walker to inspect his foot position). If we had a value that included time, we could solve this problem.

Core of the programming approach:
- Programming with values is critical
- Manage the succession of values (states) by eschewing changing values in place

Using Persistent Data Structures
- Immutable composite values are the key. Change is thereby merely another function that takes one value and returns another. The value is immutable so we don't have to worry about change outside the world of the function.

- The collection maintains its performance characteristics; they do not degrade as new versions of the value are created. (A tiny Ruby sketch of the value-to-value idea follows.)
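Ruby's hashes aren't persistent data structures in Rich's sense (merge copies rather than structurally shares), but the value-to-value idea looks roughly like this (my sketch, not Rich's code):

# "Change" is just a function from one immutable value to another.
v1 = { :laps => 3, :fouls => 0 }.freeze
v2 = v1.merge(:fouls => 1)   # a brand-new value
v1                           # => {:laps=>3, :fouls=>0}, untouched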

Rich has a way of instilling confidence in you, the developer, when it comes to working in functional programming languages, particularly his own. His message: "Don't worry about threads, don't worry about state, don't worry about concurrency gotchas. I'm worrying about all of that for you so that you don't have to." That makes me feel nice, and like I need to spend more time exploring Clojure than I already have.

Philly ETE: Session 4

Opinionated is Relative: choice and modularity in Rails applications
-- David A. Black

The idea of choice in Rails seems like a new or paradoxical concept. We've heard forever that Rails is opinionated software and "forces" you into design choices.

But David sees this paradigm shifting. He attributes the origins of this shift to Josh Susser and his blog post, "The Tyranny of Choice."

In fact, David sees signs of these choices even in the earliest versions of Rails. He is even writing a book about those choices called, appropriately, "Rails Choices" (Pragmatic Bookshelf, forthcoming). And this is where he will focus this particular talk.

Rails 3 is more modular than any of its predecessors. Modularity is slightly different from choice in that modularity has a bigger effect on architecture. So we'll first go through some everyday choices, and then we will dip into modularity, specifically with respect to "choosing" a DB layer other than ActiveRecord.

In the interest of transparency about the book, this talk, and the motives behind both, David goes on to describe his goals:

-- avoid "this sux/this rulez" - just deliver factual information so that the attendees can make up their own minds.
-- Factor in organizational realities (if he didn't before, he now has his fair share after spending three months on a client site as a Cyrus developer). Who gets to see what? Who gets to change what? Are some decisions already made (like in any legacy application)? In this case we need to evaluate the costs vs. the benefits of change. Choice, then, is loaded in a way that it is not in a brand new project.
-- Provide a good number of techniques. This is not an in-depth how-to, but the book should touch on things the reader isn't intimately familiar with: for example, an in-depth discussion of ORM validations vs. DB constraints. David also mentioned Liquid architecture as an alternative.
-- Stick close to the framework. This is not a rundown on every plugin or gem. It's not a how-to on adding these programs to your framework. However, some plugins get 'privileged' treatment such as RSpec, HAML, and Liquid.

In Rails 3, many things that used to just come wrapped up with the package are now installed via plugins. David also wants to open the subject of "supplemental code" - add-ons, overrides, project-specific libraries, etc... I like this coverage. For the longest time on my first Rails project I just stuck everything that wasn't an M, V, or C into the lib directory. Things got messy in a hurry, as you might expect. I'd like to learn more about best practices for application organization.

An aside: David HATES scaffolding. As he sees it, scaffolding obscures the real work that must be done to set up your first piece of functionality. One should be able to get through the preliminaries without being tied into scaffolding.

David has set up a site called http://anecdotes.rubypal.com/. This is supposed to augment or supplement his new book. From the site itself:

What's the "anecdote" thing?

I'd like to include some stories from people who've developed Rails applications and have real-life experience making Rails choices. Each anecdote will be about 1/3 to 1/2 a page long.

Why did you choose RSpec over TestUnit? Are you using HAML or Liquid instead of RHTML? Does your team use migrations, or does your DBA do it all?

And why? That's what I'm interested in: why you made a particular choice, in the context of your project and your team.

If your anecdote is chosen for inclusion in the book, you'll get a free copy of it when it's published.

Act 2 of David's Talk: Choice and Modularity

The demo will be a replacement of ActiveRecord with GDBM in Rails 3. Live coding here. (In GDBM, for each record, you have a file.) David shows us a Rails application console where a model acts exactly the same way it would were it interacting with ActiveRecord. Only it's not AR; it's GDBM under the hood. In showing how the replacement was implemented, David points out the freedom he has to write his own class methods from which his models extend. He then shows us the abstract_oxm.rb file that is responsible for "gluing" the object mapping layer to ActiveModel. This is the part where Rails 3 becomes very modular. You really have the choice to utilize one configuration file to map the rest of Rails to whatever database mapping layer you want. There are a few methods that "have" to be implemented in order for ActiveModel to play nice. The rest is up to you. Very cool!
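For reference, the contract David is talking about is small. A generic sketch of a Rails 3 model that satisfies it might look like the following; this is not David's abstract_oxm.rb, and GdbmRecord is a made-up name:

require 'active_model'

class GdbmRecord
  extend ActiveModel::Naming          # gives the class a model_name
  include ActiveModel::Conversion     # provides to_model, to_key, to_param
  include ActiveModel::Validations    # provides valid? and errors

  attr_reader :id

  def initialize(id = nil)
    @id = id
  end

  # Rails uses this to decide between "new record" and "existing record" behavior.
  def persisted?
    !@id.nil?
  end
end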

David sums up succinctly. To paraphrase: there is a very small contract to which you must adhere in order to plug in your own choice of persistence layer. And this is not limited to database layers; templating and the ability to more quickly and easily work with Rack are other examples of where Rails 3 provides modularity.