Engineering Large Projects in A Functional Language

Galois has been building software systems in Haskell for the past decade. This talk describes some of what we’ve learned about in-the-large, commercial Haskell programming in that time. I'll look at when and where we use Haskell; at correctness, productivity, scalability, and maintainability; and at the language features we like: types, purity, types, abstractions, types, concurrency, types! We'll also look at the Haskell toolchain (FFI, HPC, Cabal, compiler, libraries, build systems, etc.) and at being a commercial entity in a largely open source community.

Uploaded by Don Stewart
Copyright: © Attribution Non-Commercial (BY-NC)

Engineering Large Projects in a Functional Language
Lessons from a Decade of Haskell at Galois
Don Stewart | 2010-07-10 | DevNation PDX
This talk made possible by...
 Aaron Tomb
 Joel Stanley
 Adam Wick
 John Launchbury
 Andy Adams-Moran
 John Matthews
 Andy Gill
 Jonathan Daugherty
 Josh Hoyt
 David Burke
 Laura McKinney
 Dylan McNamee
 Ledah Casburn
 Eric Mertens
 Lee Pike
 Iavor Diatchki
 Levent Erkok
 Isaac Potoczny-Jones
 Louis Testa
 Jef Bell
 Magnus Carlsson
 Peter White
 Matt Sottile
 Trevor Elliott
 Paul Heinlein
 Phil Weaver
 Rogan Creswick
 Jason Dagit
 Sally Browning
 Jeff Lewis
 Sigbjorn Finne
 Joe Hurd
 Thomas Nordin
 Brett Letner
… and many others
© 2010 Galois, Inc. All rights reserved.

What does Galois do?

 Information assurance for critical systems


 Building systems that are trustworthy and secure
 Mixture of government and industry clients
 R&D with our favorite tools:
• Formal methods
• Typed functional languages
• Languages, compilers, DSLs
 Kernels, file systems, networks, servers, compilers,
security, desktop apps, ...
 Haskell for pretty much everything

Haskell is ...

 A purely functional language
 Strongly statically typed
 20 years old
 Open source
 Compiled and interpreted
 Used in research, open source and industry

http://haskell.org
http://haskell.org/platform
http://hackage.haskell.org

Yes. Haskell can do that.

 Many 20 – 200k LOC Haskell projects
 Oldest commercial projects over 10 years of
development now (e.g. Cryptol)
 Teams of 1 – 6 developers at a time
 Much pair programming, whiteboards, code reviews
 20 – 30 devs over longer project lifetime
 Have built many tools and libraries to support
Haskell development on this scale
 Haskell essential to keeping clients happy with:
• Deadlines, performance(!), maintainability

Themes

Languages matter!

 Writing correct software is difficult!
 Programming languages vary wildly in how well they
support robust, secure, safe coding practices
 Languages and tools can aid or hinder our efforts:
• Type systems
• Purity
• Modularity / compositionality
• Abstraction support
• Tools: analyses, provers, model checking
• Buggy implementations

Detect errors early!

 Detecting problems before executing the program is critical
• Debugging is hard
• Debugging low level systems is harder
• Debugging low level critical systems is ...
 Culture of error prevention
• “How could we rule out this class of errors?”
• “How could we be more precise?”

The toolchain matters!

 Can't build anything without a good tool chain
• Native code, optimizing compiler
• Libraries, libraries, libraries
• Debugging, tracing
• Profiling, inspection, runtime analysis
• Testing, analysis
• Need open, modifiable tools
– Particularly when pushing the boundaries (Haskell on bare metal...)

Community matters!

 Soup of ideas in a large, open research community:
• Rapid adoption of new ideas
 Support, maintenance and help
• Can't build everything we need in-house!
 Give back via:
• Workshops: CUFP, ICFP, Haskell Symposium
• Hackathons
• Industrial Haskell Group
• Open source code and infrastructure
• Teaching: papers, blogs, talks

How Galois Uses Haskell

1. The Type System

Types make our lives easier

 Cheap way to verify properties
• Cheaper than theorem proving
• More assurance than testing
• Saves debugging in hostile environments
 Typical conversation:
• Engineer A: “Spec says this must never happen”
• Engineer B: “Can we enforce that in the type system?”

Kinds of things types enforce

 Simple things:
• Correct arguments to a function
• Function f does not touch the disk
• No null pointers
• Mixing up similar concepts:
– Virtual / physical addresses
 Serious things:
• Information flow policies
• Correct component wiring and integration
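
The "simple things" above can be sketched in a few lines. This is only an illustration: the type names, the fixed-offset translation and the DMA function are hypothetical and not taken from any Galois system.

```haskell
import Data.Word (Word64)

-- Two address spaces share a representation but get distinct types,
-- so passing one where the other is expected is a compile-time error.
-- (All names here are hypothetical.)
newtype VirtAddr = VirtAddr Word64 deriving (Eq, Show)
newtype PhysAddr = PhysAddr Word64 deriving (Eq, Show)

-- Toy translation: subtract a fixed base. The constant is an
-- assumption made only to keep the example small.
translate :: VirtAddr -> PhysAddr
translate (VirtAddr v) = PhysAddr (v - kernelBase)
  where
    kernelBase = 0xC0000000

-- Only physical addresses are accepted by the (pretend) DMA engine.
dmaTarget :: PhysAddr -> String
dmaTarget (PhysAddr p) = "dma@" ++ show p

-- No IO in the signature: short of unsafe escape hatches, this
-- function cannot touch the disk.
pageOffset :: VirtAddr -> Word64
pageOffset (VirtAddr v) = v `mod` 4096
```

Writing dmaTarget (VirtAddr 0) is rejected by the type checker, which is the "mixing up similar concepts" case from the list.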

Recent experience
First demo of a new system
 Six engineers
 50k lines of code, in 5 components, developed over a
number of months
 Integrated, tested, demo'd in only a week, two months
ahead of schedule, significantly above performance
spec.
 1 space leak, spotted and fixed on first day of testing via
the heap profiler
 2 bugs found (typos from spec)

Purity is fundamental

 Difficult to show safety without purity
 Code should be pure by default
 Makes large systems easier to glue:
• Pure code is “safe” by default to call
 Effects are “code smells”, and have to be treated
carefully
 The world has too many impure languages: don't add to
that

Types aren't enough though

 Still not expressive enough for a lot of the properties we want to enforce
 We care a lot about sizes in types
• “Input must only be 128, 192 or 256 bits”
• “Type T should be represented with 7 bits”
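
One partial workaround for the "128, 192 or 256 bits" rule is to make the legal sizes the only constructors of a type and validate untrusted input once at the boundary. The sketch below uses a plain data type with hypothetical names. It is not true type-level sizing, which is the gap this slide describes.

```haskell
-- The only valid key sizes are the constructors themselves.
-- (Hypothetical names, for illustration only.)
data KeySize = Bits128 | Bits192 | Bits256
  deriving (Eq, Show, Enum, Bounded)

keyBits :: KeySize -> Int
keyBits Bits128 = 128
keyBits Bits192 = 192
keyBits Bits256 = 256

-- Untrusted input is checked once; everything else is rejected.
mkKeySize :: Int -> Maybe KeySize
mkKeySize 128 = Just Bits128
mkKeySize 192 = Just Bits192
mkKeySize 256 = Just Bits256
mkKeySize _   = Nothing

-- Downstream code takes a KeySize and never sees an invalid size.
keyBytes :: KeySize -> Int
keyBytes k = keyBits k `div` 8
```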

Other tools in the bag

 Extended static analysis tools
 Model checking
• SAT, SMT, …
 Theorem proving
• Isabelle, Agda, Coq

 How much assurance do you need?

2. Abstractions

Monads

 Constantly rolling new monads
• Captures critical facts about the execution environment in the
type
 Directly encodes semantics we care about
• “Computed keys are not visible outside the M component”
• “Function f has read-only access to memory”
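
A minimal sketch of "read-only access to memory" as a hand-rolled monad, using only base. The memory model (an association list) and all names are assumptions for illustration. In a real module the ReadOnly constructor would not be exported, so peekByte would be the only primitive.

```haskell
import Data.Word (Word8)

-- Hypothetical memory model: address/byte pairs.
type Memory = [(Int, Word8)]

-- A computation that may consult memory but has no way to change it.
newtype ReadOnly a = ReadOnly { runReadOnly :: Memory -> a }

instance Functor ReadOnly where
  fmap f (ReadOnly g) = ReadOnly (f . g)

instance Applicative ReadOnly where
  pure x = ReadOnly (const x)
  ReadOnly f <*> ReadOnly g = ReadOnly (\m -> f m (g m))

instance Monad ReadOnly where
  ReadOnly g >>= k = ReadOnly (\m -> runReadOnly (k (g m)) m)

-- The single primitive: read one byte. There is no write primitive.
peekByte :: Int -> ReadOnly (Maybe Word8)
peekByte addr = ReadOnly (lookup addr)

-- The type alone says this function can only read memory.
sumBytes :: [Int] -> ReadOnly Int
sumBytes addrs = do
  bytes <- mapM peekByte addrs
  pure (sum [fromIntegral b | Just b <- bytes])
```

For example, runReadOnly (sumBytes [0, 1, 9]) [(0, 10), (1, 20)] adds the two mapped bytes and skips the unmapped address.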

Algebraic Data Types

 Every system is either an interpreter or a compiler
• Abstract syntax trees are ubiquitous
• Represent processes symbolically, via ADTs, then evaluate
them in a safe (monadic) context
• Precise, concise control over possible values
• But need precise representation control
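
The "represent symbolically, then evaluate in a safe context" pattern, in miniature. The expression language is a made-up example, and Either String stands in for whatever monad a real system would use.

```haskell
-- A tiny abstract syntax tree (hypothetical example language).
data Expr
  = Lit Int
  | Add Expr Expr
  | Div Expr Expr
  deriving (Eq, Show)

-- Evaluation happens inside Either, so failure is a value rather
-- than a crash.
eval :: Expr -> Either String Int
eval (Lit n)   = Right n
eval (Add a b) = (+) <$> eval a <*> eval b
eval (Div a b) = do
  x <- eval a
  y <- eval b
  if y == 0
    then Left "division by zero"
    else Right (x `div` y)
```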

Laziness

 Captures some concepts perfectly
• “A stream of 4k packets from the wire”
 Critical for control abstractions in DSLs
 Useful for prototyping:
• error “M.F.foo: not implemented”
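
A sketch of both uses of laziness mentioned above: an unbounded stream cut into 4k packets, and an error stub for a function that is not written yet. The "wire" here is a fake, endless byte source, and all names are hypothetical.

```haskell
import Data.Word (Word8)

-- Split a (possibly infinite) byte stream into packets of n bytes.
-- Only the packets that are actually demanded get built.
packets :: Int -> [Word8] -> [[Word8]]
packets n bytes
  | n <= 0     = []
  | null bytes = []
  | otherwise  = chunk : packets n rest
  where
    (chunk, rest) = splitAt n bytes

-- A pretend wire that never stops producing bytes.
wire :: [Word8]
wire = cycle [0 .. 255]

-- Consumes only the first k packets of the endless stream.
firstPacketSizes :: Int -> [Int]
firstPacketSizes k = map length (take k (packets 4096 wire))

-- Prototyping stub: type checks now, fails only if it is ever called.
decodeHeader :: [Word8] -> Int
decodeHeader = error "Sketch.decodeHeader: not implemented"
```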

Laziness

 Makes time and space reasoning harder!
• Mostly harmless in practice
• Stress testing tends to reveal retainers
• Graphical profiling knocks it dead
 Must be able to precisely enable/disable
 Be careful with exceptions and mutation
 whnf/rnf/! are your friends
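
A small sketch of switching laziness off where it hurts, using strict constructor fields, foldl' and a bang pattern. The functions are hypothetical examples. rnf is not shown because it lives in the separate deepseq package rather than base.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Strict fields: both counters are evaluated whenever a Stats value
-- is, so no chain of suspended additions builds up.
data Stats = Stats !Int !Int   -- element count, running total

step :: Stats -> Int -> Stats
step (Stats n t) x = Stats (n + 1) (t + x)

-- foldl' forces the accumulator to weak head normal form each step.
mean :: [Int] -> Maybe Double
mean xs =
  case foldl' step (Stats 0 0) xs of
    Stats 0 _ -> Nothing
    Stats n t -> Just (fromIntegral t / fromIntegral n)

-- A bang pattern forces the accumulator on every iteration.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go !acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)
```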

Type classes

 We use type classes
• Well defined interfaces between large components (sets of
modules)
• Natural code reuse
• Capture general concepts in a natural way
• Capture interface in a clear way
• Kick butt EDSLs (see Lennart's blog)
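
As an illustration of a type class as the interface between components, here is a hypothetical key/value store. The class, its methods and the list-backed instance are invented for this sketch.

```haskell
-- The interface a client component programs against.
class Store s where
  emptyStore :: s
  putKey     :: String -> Int -> s -> s
  getKey     :: String -> s -> Maybe Int

-- One possible implementation: a plain association list.
newtype AssocStore = AssocStore [(String, Int)]

instance Store AssocStore where
  emptyStore = AssocStore []
  putKey k v (AssocStore kvs) =
    AssocStore ((k, v) : filter ((/= k) . fst) kvs)
  getKey k (AssocStore kvs) = lookup k kvs

-- Client code is written once and works for any Store instance.
fromPairs :: Store s => [(String, Int)] -> s
fromPairs = foldr (\(k, v) s -> putKey k v s) emptyStore
```

Swapping in a different backing structure means writing another instance. fromPairs and its callers stay unchanged.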

Concurrency and Parallelism

 forkIO rocks
• Cheap, very fast, precise threads
 MVars rock
 STM rocks (safely composable locks!)

 Result: not shy introducing concurrency when appropriate
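
A minimal sketch of forkIO and MVars working together, using only base. The worker count and the shared counter are a made-up workload. STM is left out because it comes from the separate stm package.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (forM_, replicateM_)

-- Start `workers` lightweight threads; each adds 1 to a shared
-- counter `perWorker` times, then signals that it has finished.
countConcurrently :: Int -> Int -> IO Int
countConcurrently workers perWorker = do
  counter <- newMVar (0 :: Int)
  done    <- newEmptyMVar
  forM_ [1 .. workers] $ \_ -> forkIO $ do
    -- modifyMVar_ takes the MVar, applies the update, and puts it
    -- back, so the increments cannot interleave.
    replicateM_ perWorker (modifyMVar_ counter (\n -> pure $! n + 1))
    putMVar done ()
  -- Wait for one completion signal per worker.
  replicateM_ workers (takeMVar done)
  readMVar counter
```

In GHCi, countConcurrently 8 1000 should return the full total regardless of how the threads are scheduled.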

3. Foreign Function Interface

Foreign Function Interface

 The world is a messy place
 A good FFI means we can always call someone else's
code if necessary
 Have to talk to weird bits of hardware and weird proof
systems
 ForeignPtr is great abstraction tool
 Must have clear API into the runtime system (hot topic at
the moment)
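
For a sense of how little ceremony "calling someone else's code" needs, here is a minimal sketch that binds the C library's cos. The wrapper name is hypothetical, and ForeignPtr (for foreign-owned memory) is not shown.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble (..))

-- Bind the C function `cos` from math.h. It has no side effects,
-- so it is imported at a pure type.
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> CDouble

-- A Haskell-friendly wrapper that hides the C types.
cosine :: Double -> Double
cosine = realToFrac . c_cos . realToFrac
```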

4. Meta programming

There's always boilerplate

 Abstractions get rid of a lot of repetitive code, but there's always something that's not automated
 We use a little Template Haskell
 Other generics:
• Hinze-style generics
• SYB generics
 Particularly useful for generating instance code for marshalling

5. Performance

Fast enough for majority of things

 Vast majority of code is fast enough
• GHC -O2 -funbox-strict-fields
• Happy with 1 – 2x C for low level code
 Last few drops get squeezed out:
• Profiling
• Low level Haskell
• Cycle-level measurement
• EDSLs to generate better code
• Calling into C

Performance

 Really precise performance requires expertise
 Libraries are helping reify “oral traditions” about
optimization
 Still a lack of clarity about performance techniques in the
broader Haskell community though

6. Debugging

There are still bugs!

 Testing
• QuickCheck!!!
 Heap profiling
• “By type” profiling of the heap
 GHC -fhpc
• Great for finding exceptions
• Understanding what is executing
 +RTS -sstderr
• Explain what GC, threads, memory is up to

7. Documentation

Generating supporting artifacts

 Haddock is great for reference material
• Helps capture design in the source
• Code + types becomes self documenting
 Design documents can be partially extracted via:
• The major data and type signatures
• graphmod
• cabalgraph
• HPC analysis

8. Libraries

Hackage Changed Everything

 2200+ libraries created in 3 years. There's a library for everything, and often more than one...
 Can sit back and let mtl / monadlib / haxml / hxt fight it
out :)
 Static linking → need BSD licensed code if we want to
ship
 Haskell Platform to answer QA questions

9. Shipping code

Cabal

 I don't know how Haskell was possible before Cabal :)
 Quickly adopted Cabal/cabal-install across projects
 cabal-install:
• Simple, clean integration of internal and external components
into packageable objects

10. Conventions

We try to ...

 -Wall police
 Consistent layout
 No tabs
 Import qualified Control.Exception
 {-# LANGUAGE … #-}
 Map exceptions into Either / Maybe
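
Two of these conventions, the qualified import of Control.Exception and mapping exceptions into Either, can be sketched together. The function name is hypothetical, and division is just a convenient source of an exception.

```haskell
import qualified Control.Exception as E

-- Turn an arithmetic exception into an ordinary Left value, so
-- callers handle failure through the type instead of a handler.
safeDiv :: Int -> Int -> IO (Either String Int)
safeDiv x y = do
  -- evaluate forces the division inside `try`, so the exception is
  -- raised (and caught) here rather than later, lazily.
  r <- E.try (E.evaluate (x `div` y))
  pure $ case r of
    Left e  -> Left (show (e :: E.ArithException))
    Right v -> Right v
```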

We try to ...

 deriving Show
 Line/column for errors if you must throw
 No global mutable state
 Put type sigs in “when you're done” with the design
 Use GHCi for rapid experimentation
 Cabal by default.
 Libraries by default

11. Training

Easy to find Haskell programmers

 With a big open source community, it's much easier to find Haskell programmers now
 Many more applicants than jobs, often with significant
experience from open source
 We train on-site, and new resources like LYAH and
RWH make this easier.

12. Things that we still need

More support for large scale programming

 Enforcing conventions across the code
 Data representation precision (emerging)
 A serious refactoring tool (HaRe on Hackage!)
 Vetted and audited libraries by experts (Haskell Platform)
 Idioms for mapping design onto
types/functions/classes/monads
 Better capture your 100 module design!
