EDA{N/F}40: Notes for talk about Haskell tooling

This is a summary of what we did during the talk; the slides are here. You can find an 'empty' stack project set up for exercise 1 here.

Editing Haskell code

We started out with the contentious question of whether to use spaces or tabs for indentation in our source code. For a language like Haskell (or Python), where the semantics of our program is determined by the indentation, using spaces is really the only reasonable choice (and if we care about precision, using spaces is the only way to make sure our code looks the same in any editor, no matter what language we use -- so, in this case, Richard Hendricks doesn't know what he's talking about).

We then briefly touched on the subject of editors -- the editor is the tool you'll be spending most of your time in, by far, but most 'programming editors' (Emacs, vi/vim, Sublime, Atom, ...) have great Haskell support, and some IDEs (VS Code, IntelliJ, Eclipse) have extensions for Haskell, so you'll probably be fine with what you're using now. I showed a graph from the 2017 state of Haskell survey, showing that vim and Emacs are the most popular Haskell editors today.

The compiler, ghc, and the REPL, ghci

We then turned to the Haskell compiler, ghc, and the interpreter, ghci. First we compiled the ubiquitous 'hello, world!'-program:

$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.0.2
$ cat > hello.hs
main = putStrLn "hello, world!"
$ ghc hello.hs
[1 of 1] Compiling Main             ( hello.hs, hello.o )
Linking hello ...
$ ls -l
total 2192
-rwxr-xr-x 1 cso cso 2231064 mar 28 10:26 hello
-rw-r--r-- 1 cso cso     753 mar 28 10:26 hello.hi
-rw-r--r-- 1 cso cso      32 mar 28 10:26 hello.hs
-rw-r--r-- 1 cso cso    3360 mar 28 10:26 hello.o
$ ./hello
hello, world!
$

Then we tried the Read-Evaluate-Print-Loop (REPL):

$ ghci
GHCi, version 8.0.2: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /home/cso/.ghci
λ: :l hello.hs
[1 of 1] Compiling Main             ( hello.hs, interpreted )
Ok, modules loaded: Main.
λ: main
hello, world!

We added a function for calculating factorials to our hello-world-program, and reloaded and tested it.
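
The factorial function itself is not reproduced in these notes. A minimal sketch of what the extended hello.hs could look like follows; the name factorial and its definition are assumptions for illustration, not necessarily what we wrote during the talk:

```haskell
-- hello.hs, extended with a factorial function.
-- The name and the definition are illustrative assumptions.
factorial :: Integer -> Integer
factorial n = product [1 .. n]

main :: IO ()
main = putStrLn "hello, world!"
```

After saving the file, :r (short for :reload) at the ghci prompt reloads it, and evaluating factorial 5 should then give 120.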

ghci has some useful built-in commands, such as:

ghci can also be configured in your ~/.ghci file. As an example, adding:

let ghciEscapeShellArg arg = "'" ++ concatMap (\x -> if x == '\'' then "'\"'\"'" else [x]) arg ++ "'"
:set prompt "λ: "
:def! hlint return . (":! hlint \"./\"" ++) . ghciEscapeShellArg
:def! hoogle return . (":! hoogle --color --count=20 " ++) . ghciEscapeShellArg
:def! doc return . (":! hoogle --color --info " ++) . ghciEscapeShellArg

gives us access to three additional commands:

To run these commands, you have to install hoogle and hlint; the section about stack below shows how to do that.

λ: :l hello.hs
[1 of 1] Compiling Main             ( hello.hs, interpreted )
Ok, modules loaded: Main.
λ: :hlint
No suggestions
λ: :info map
map :: (a -> b) -> [a] -> [b]   -- Defined in ‘GHC.Base’
λ: :hoogle (a -> Bool) -> [a] -> [b]
Prelude filter :: (a -> Bool) -> [a] -> [a]
Prelude takeWhile :: (a -> Bool) -> [a] -> [a]
Prelude dropWhile :: (a -> Bool) -> [a] -> [a]
Data.List takeWhile :: (a -> Bool) -> [a] -> [a]
Data.List dropWhile :: (a -> Bool) -> [a] -> [a]
Data.List dropWhileEnd :: (a -> Bool) -> [a] -> [a]
Data.List filter :: (a -> Bool) -> [a] -> [a]
...
λ: ...
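
The ghciEscapeShellArg helper in the ~/.ghci snippet above is what makes it safe to pass a type signature, with its spaces and parentheses, on to the shell. As a rough illustration, here is the same definition as a standalone program; the sample arguments in main are made up for the demonstration:

```haskell
-- Same definition as in the ~/.ghci snippet: wrap the argument in single
-- quotes, and replace every embedded single quote with '"'"' so that the
-- shell sees it as a literal quote character.
ghciEscapeShellArg :: String -> String
ghciEscapeShellArg arg =
  "'" ++ concatMap (\x -> if x == '\'' then "'\"'\"'" else [x]) arg ++ "'"

-- Hypothetical sample arguments, just to show what is sent to the shell.
main :: IO ()
main = mapM_ (putStrLn . ghciEscapeShellArg)
  [ "map"
  , "(a -> Bool) -> [a] -> [a]"
  , "it's"
  ]
```

So :hoogle (a -> Bool) -> [a] -> [a] hands the whole signature to hoogle as one single-quoted argument.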

Cabal

To build interesting programs, we'll need to import libraries, and since 2005, cabal has been used to simplify packaging of Haskell libraries -- it is several things at once, amongst them:

A .cabal file describes what dependencies our project has, we'll look a bit more at .cabal files below.

But although cabal is a great piece of software, it unfortunately suffers from some problems -- if you're interested, they are outlined here. The most basic problem is that cabal doesn't give us repeatable builds, which is somewhat ironic in a functional programming context.

Stack, a tool for building Haskell projects

In 2015, stack was released to solve the problems users had with cabal -- it is built on top of cabal, but uses curated collections of packages, bundled with a version of the ghc compiler.

We can use stack to install various tools, such as hoogle, hlint, doctest, xmonad, etc. To install hoogle, we write:

$ stack install hoogle

To have something to work with during the lecture, we chose to write a simple library for keeping track of duplicate values in lists -- we wanted the two functions:

hasDups :: (Eq a) => [a] -> Bool
removeDups :: (Eq a) => [a] -> [a]

and we started out by creating a stack project named dups (the $ below is just a shell prompt, not something you should write yourself...):

$ stack new dups

This created a directory dups, with the following files (you can also use templates when you create stack projects; we used the default, which is what you see below):

$ tree dups
dups
├── app
│   └── Main.hs
├── ChangeLog.md
├── dups.cabal
├── LICENSE
├── package.yaml
├── README.md
├── Setup.hs
├── src
│   └── Lib.hs
├── stack.yaml
└── test
    └── Spec.hs

3 directories, 10 files

This is similar to what you would get if you used mix to create an Elixir project, sbt to create a Scala project, or lein to create a Clojure project.

The source code is spread out into three directories:

We also have two files with project configuration:

To get ready to develop our own library, we had to make a few changes:

Then it was time to build our project for the first time:

$ stack setup
$ stack build

Everything stack produces is put in a directory .stack-work/; our main file could be found in:

.stack-work/dist/x86_64-linux-nopie/Cabal-2.0.1.0/build/dups-exe/dups-exe

It is an executable binary, and we can run it by just giving its filename, but we can also run it using stack:

$ stack exec dups-exe

(i.e., the project name (dups) followed by -exe).

Testing

We talked a little bit about testing in general, and unit tests specifically, and then tried to use "Test First" to write our library -- we began by importing Test.Tasty and Test.Tasty.HUnit, and then wrote the simplest possible unit test for hasDups in test/Spec.hs (using Strings as lists is often convenient):

import Test.Tasty
import Test.Tasty.HUnit

import Dups

hasDupsTests = testGroup "Unit tests for hasDups"
  [ testCase "empty list" $ hasDups "" @?= False
  ]

main = defaultMain hasDupsTests

To import Test.Tasty and Test.Tasty.HUnit, we need to add the packages tasty and tasty-hunit as dependencies in our package.yaml file:

tests:
  dups-test:
    main:                Spec.hs
    source-dirs:         test
    ghc-options:
    - -threaded
    - -rtsopts
    - -with-rtsopts=-N
    dependencies:
    - dups
    - tasty
    - tasty-hunit

We then ran

$ stack test

and got a compilation error -- we haven't even declared hasDups yet.

So we wrote some code in src/Dups.hs:

module Dups
    ( hasDups
    ) where

hasDups :: Eq a => [a] -> Bool
hasDups = undefined

This time the code compiles, but the test fails because we haven't defined hasDups.

This way of coding is recommended by some people: you're not allowed to write any business code until you have tests which require it. It may not be a panacea for all programming problems, but it could be worth trying out.

So, we started by implementing code for handling empty lists (and nothing more, since we haven't written any tests for it yet):

module Dups
    ( hasDups
    ) where

hasDups :: Eq a => [a] -> Bool
hasDups [] = False
hasDups _  = undefined

And this made the first test pass (we ran stack test).

We now added the next test:

hasDupsTests = testGroup "Unit tests for hasDups"
  [ testCase "empty list" $ hasDups "" @?= False
  , testCase "list with single element" $ hasDups "a" @?= False
  ]

And this failed since our current hasDups doesn't handle non-empty lists -- we therefore added code to handle the second test case above:

module Dups
    ( hasDups
    ) where

hasDups :: Eq a => [a] -> Bool
hasDups []      = False
hasDups (x:xs)  = x `elem` xs

This worked, so we added some new tests:

hasDupsTests = testGroup "Unit tests for hasDups"
  [ testCase "empty list" $ hasDups "" @?= False
  , testCase "list with single element" $ hasDups "a" @?= False
  , testCase "list with two different values" $ hasDups "ab" @?= False
  , testCase "list with two duplicate values" $ hasDups "aa" @?= True
  , testCase "list with three different values" $ hasDups "abc" @?= False
  , testCase "list with three values, with duplicates" $ hasDups "abb" @?= True
  ]

The last of these tests failed, so we had to make sure hasDups looks for duplicates in the tail of the list:

module Dups
    ( hasDups
    ) where

hasDups :: Eq a => [a] -> Bool
hasDups []      = False
hasDups (x:xs)  = x `elem` xs || hasDups xs
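
To see why the recursive call matters, it can help to evaluate the failing test case by hand. The sketch below puts the two versions side by side; the name hasDupsHeadOnly is made up here to keep the earlier version around for comparison:

```haskell
-- Earlier version: only checks whether the first element occurs again.
-- (hasDupsHeadOnly is a hypothetical name used for this comparison.)
hasDupsHeadOnly :: Eq a => [a] -> Bool
hasDupsHeadOnly []     = False
hasDupsHeadOnly (x:xs) = x `elem` xs

-- Current version: also looks for duplicates in the tail.
hasDups :: Eq a => [a] -> Bool
hasDups []     = False
hasDups (x:xs) = x `elem` xs || hasDups xs

-- Evaluating the test case "abb" step by step:
--
--   hasDupsHeadOnly "abb"
--     = 'a' `elem` "bb"
--     = False                              -- the duplicate 'b' is missed
--
--   hasDups "abb"
--     = 'a' `elem` "bb" || hasDups "bb"
--     = False || hasDups "bb"
--     = 'b' `elem` "b" || hasDups "b"
--     = True                               -- (||) never looks at hasDups "b"
```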

We eventually ended up with (for a while we also had an infinite list in there somewhere, but I removed it):

hasDupsTests = testGroup "Unit tests for hasDups"
  [ testCase "empty list" $ hasDups "" @?= False
  , testCase "list with single element" $ hasDups "a" @?= False
  , testCase "list with two different values" $ hasDups "ab" @?= False
  , testCase "list with two duplicate values" $ hasDups "aa" @?= True
  , testCase "list with three different values" $ hasDups "abc" @?= False
  , testCase "list with three values, with duplicates" $ hasDups "abb" @?= True
  , testCase "longer list without duplicates" $ hasDups "abcdefghijklmn" @?= False
  , testCase "longer list with duplicates" $ hasDups "abcdefghijklcmn" @?= True
  ]

We used the same procedure to implement removeDups, and ended up with the following unit tests:

removeDupsTests = testGroup "Unit tests for removeDups"
  [ testCase "empty list" $ removeDups "" @?= ""
  , testCase "single element list" $ removeDups "a" @?= "a"
  , testCase "two element list with the same value" $ removeDups "aa" @?= "a"
  , testCase "two element list with different values" $ removeDups "ab" @?= "ab"
  , testCase "longer list with different values" $ removeDups "abcdefgh" @?= "abcdefgh"
  , testCase "longer list with duplicate values" $ removeDups "abcdefegbah" @?= "abcdefgh"
  ]

unitTests = testGroup "All unit tests"
  [ hasDupsTests
  , removeDupsTests
  ]

and changed main to:

main = defaultMain unitTests

In src/Dups.hs we ended up with:

module Dups
    ( hasDups
    , removeDups
    ) where

hasDups :: Eq a => [a] -> Bool
hasDups []      = False
hasDups (x:xs)  = x `elem` xs || hasDups xs

removeDups :: Eq a => [a] -> [a]
removeDups []      = []
removeDups (x:xs)  = x : (removeDups [e | e <- xs, e /= x])

We then talked a bit about testing properties of the code -- one property which was suggested was that the list shouldn't be longer after we've removed duplicates, and this can be expressed as:

notLongerAfterRemove :: [Int] -> Bool
notLongerAfterRemove list = length list >= length (removeDups list)

The tasty-quickcheck package, which is built on the QuickCheck library, has a function testProperty which essentially takes a function someProp of type t -> Bool, and tries to find a 't' value which makes it fail, i.e., return False, by generating random test values. This works for most standard types 't', such as primitive values, lists, and tuples, and we can use it to check the property above:

propertyTests = testGroup "Property tests"
  [ testProperty "list not longer after removeDups" notLongerAfterRemove
  ]

To do this we first had to add a new import:

import Test.Tasty.QuickCheck

and also add a new dependency in package.yaml:

tests:
  dups-test:
    main:                Spec.hs
    source-dirs:         test
    ghc-options:
    - -threaded
    - -rtsopts
    - -with-rtsopts=-N
    dependencies:
    - dups
    - tasty
    - tasty-hunit
    - tasty-quickcheck
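
To get a feeling for what QuickCheck does when a property does not hold, one can also try it outside tasty. The following is a sketch only, assuming the QuickCheck package is available; the property sameLengthAfterRemove is deliberately wrong and was made up for this illustration:

```haskell
import Test.QuickCheck (quickCheck)

-- Same definition as in src/Dups.hs, repeated so the file stands alone.
removeDups :: Eq a => [a] -> [a]
removeDups []     = []
removeDups (x:xs) = x : removeDups [e | e <- xs, e /= x]

-- The property from the talk: removing duplicates never makes a list longer.
notLongerAfterRemove :: [Int] -> Bool
notLongerAfterRemove list = length list >= length (removeDups list)

-- A deliberately false property (hypothetical): it claims that removeDups
-- never changes the length, which fails for any list with a repeated value.
sameLengthAfterRemove :: [Int] -> Bool
sameLengthAfterRemove list = length list == length (removeDups list)

main :: IO ()
main = do
  quickCheck notLongerAfterRemove   -- should pass for all generated lists
  quickCheck sameLengthAfterRemove  -- should be falsified by some generated list
```

Since the test values are random, the counterexample reported for the second property can differ between runs; QuickCheck tries to shrink it to a small list before printing it.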

We now created a new group of tests which covered all unit tests and our property tests, and used them in the main program:

allTests = testGroup "All tests"
  [ unitTests
  , propertyTests
  ]

main :: IO ()
main = defaultMain allTests

We added a few more property tests:

propertyTests = testGroup "Property tests"
  [ testProperty "list not longer after removeDups" notLongerAfterRemove
  , testProperty "no duplicates after removeDups" noDupsAfterRemoveDups
  , testProperty "first element same after removeDups" $
    \list -> not (null list) ==> firstElementSameAfterRemoveDups list
  , testProperty "same as nub" sameAsNub
  ]

notLongerAfterRemove :: [Int] -> Bool
notLongerAfterRemove list = length list >= length (removeDups list)

noDupsAfterRemoveDups :: [Int] -> Bool
noDupsAfterRemoveDups list = hasDups (removeDups list) == False

firstElementSameAfterRemoveDups :: [Int] -> Bool
firstElementSameAfterRemoveDups list = head list == head (removeDups list)

-- nub is defined in Data.List, so test/Spec.hs also needs: import Data.List (nub)
sameAsNub :: [Int] -> Bool
sameAsNub list = removeDups list == nub list

You'll soon learn to write some of this code more elegantly, but this will suffice for now.

The tests above are just what we came up with during the talk; if this were a real project, we would give them much more consideration.

Documentation

It's very easy to generate documentation for our library; we just have to write

$ stack haddock

to generate this.

We can add annotations in our code to generate more informative documentation. We tried:

-- | Some functions for finding and removing duplicates from lists.
--   Used only as a demo in EDAN40/EDAF40.

module Dups
    ( hasDups
    , removeDups
    ) where

-- | Check if a list contains duplicates.
--
-- Examples:
--
-- > hasDups "abc"
--
-- should return False
--
-- > hasDups "abca"
--
-- should return True

hasDups :: (Eq a) => [a] -> Bool
hasDups [] = False
hasDups (x:xs) = x `elem` xs || hasDups xs

-- | Remove /duplicates/ from a list. This works just like
--   'Data.List.nub' in the "Data.List" module.
--
-- Example:
--
-- > removeDups "abracadabra"
--
-- should return "abrcd"

removeDups :: (Eq a) => [a] -> [a]
removeDups []     = []
removeDups (x:xs) = x : (removeDups [e | e <- xs, e /= x])

and got this.

We can also use a tool called doctest (the idea is taken from Python's doctest-tool), which lets us write examples in our code, and have them verified:

-- | Check if a list contains duplicates.
--
-- Examples:
--
-- >>> hasDups "abc"
-- False
-- >>> hasDups "abca"
-- True

hasDups :: (Eq a) => [a] -> Bool
hasDups [] = False
hasDups (x:xs) = x `elem` xs || hasDups xs

-- | Remove /duplicates/ from a list. This works just like
--   'Data.List.nub' in the "Data.List" module.
--
-- Example:
--
-- >>> removeDups "abracadabra"
-- "abrcd"

removeDups :: (Eq a) => [a] -> [a]
removeDups []     = []
removeDups (x:xs) = x : (removeDups [e | e <- xs, e /= x])

This gives the following documentation, but we can also run:

$ stack exec doctest src/Dups.hs
Examples: 3  Tried: 3  Errors: 0  Failures: 0

So, doctest checks our code samples, and all of them passed without error or failure -- this guarantees that the examples in our documentation are correct.

Here endeth the lecture.