# Stream Data Types

I’m currently looking for a very fast streaming data type for realtime audio processing in Haskell. The data types have two objectives:

- Continuously consume and produce values
- Maintain state

I created a project that contains different implementations and a benchmark that tests their speed. In this post I will show and discuss the different implementations and show their benchmarking results. The benchmark computes the integral of a sinus curve for 1 second at a sample rate of 48 kHz. Each of the different implementation defines a `nthIntegral`

function, that is the integral of a function from 0 to `n`

with a step size of 1/48000.

\[ \int_0^1 \sin(t) dt \]

## Coroutine / Lazy

This data type looks like the state monad, except that it closures its state inside the function.

```
newtype Coroutine a b = Coroutine (a -> (b,Coroutine a b))
scan :: (s -> a -> s) -> s -> Coroutine a s
scan f = go
where
go s = Coroutine $ \a -> let s' = f s a in (s',go s')
unfold :: (s -> (b,s)) -> s -> Coroutine a b
unfold f = go
where
go s = Coroutine $ const $ let (b,s') = f s in (b,go s')
(!!) :: Coroutine () a -> Int -> a
Coroutine f !! 0 = fst $ f ()
Coroutine f !! n = snd (f ()) !! (n-1)
infixl 9 !!
(>>) :: Coroutine a b -> Coroutine b c -> Coroutine a c
Coroutine f >> Coroutine g = Coroutine $ \a ->
let (b,f') = f a
(c,g') = g b
in (c,f' >> g')
infixl 1 >>
integral :: Int -> Coroutine Double Double
integral rate = scan (\x i -> i + x * dt) 0
where
dt = 1 / fromIntegral rate
nthIntegral :: Int -> (Double -> Double) -> Int -> Double
nthIntegral rate fun n =
(unfold (\t -> (fun (t / fromIntegral rate),t+1)) 0
>> integral rate)
!! n
```

#### Numbers

```
benchmarking coroutine/lazy/integral of sinus for 1 second at 48 kHz
time 117.0 ms (116.5 ms .. 118.0 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 117.9 ms (117.6 ms .. 118.3 ms)
std dev 471.2 μs (340.4 μs .. 601.5 μs)
variance introduced by outliers: 11% (moderately inflated)
```

The coroutine type has similarities to the one I presented in my previous posts, except that this is more concrete. The `scan`

and `unfold`

functions have the same meaning as their list counterparts. The problem with these functions is, that especially the `scan`

function produces a large space leak because the state is handled strict enough.

## Coroutine / Strict

If we think this further, we just have to add strictness annotations in the `scan`

and `unfold`

function that evaluates the state to Weak Head Normal Form (WHNF) in each iteration.

```
newtype Coroutine a b = Coroutine (a -> (b, Coroutine a b))
scan :: (s -> a -> s) -> s -> Coroutine a s
scan f = go
where
go s = Coroutine $ \a -> let s' = f s a in s' `seq` (s',go s')
unfold :: (s -> (b,s)) -> s -> Coroutine a b
unfold f = go
where
go s = Coroutine $ const $ let (b,s') = f s in s' `seq` (b,go s')
```

#### Numbers

```
benchmarking coroutine/strict/integral of sinus for 1 second at 48 kHz
time 86.22 ms (83.67 ms .. 88.77 ms)
0.999 R² (0.998 R² .. 1.000 R²)
mean 84.65 ms (83.88 ms .. 85.85 ms)
std dev 1.432 ms (648.3 μs .. 2.493 ms)
```

That gave a nice speed up, but we can do better. One contention is the closuring of state.

## Stream / Lazy

One data type that can produce values and maintain state is of course an infinite list. In this benchmark I use the Stream library from Wouter Swierstra and Bas van Dijk.

```
import Data.Stream (Stream)
import qualified Data.Stream as S
-- data Stream a = Cons a (Stream a)
integral :: Int -> Stream Double -> Stream Double
integral rate = S.scan' (\i x -> i + x * dt) 0
where
dt = 1 / fromIntegral rate
nthIntegral :: Int -> (Double -> Double) -> Int -> Double
nthIntegral rate fun n =
integral rate (S.unfold (\t -> (fun (t / fromIntegral rate),t+1)) 0)
S.!! n
```

#### Numbers

```
benchmarking stream/lazy/integral of sinus for 1 second at 48 kHz
time 10.71 ms (10.63 ms .. 10.76 ms)
1.000 R² (0.999 R² .. 1.000 R²)
mean 10.61 ms (10.52 ms .. 10.67 ms)
std dev 186.2 μs (127.0 μs .. 260.3 μs)
```

This looks pretty decent, but we can do better if we use a stream data type that is strict in the head of the stream (not the tail, this would lead to non-termination).

## Stream / Strict

```
data Stream a = Cons !a (Stream a)
scan :: (a -> b -> a) -> a -> Stream b -> Stream a
scan f a (Cons b bs) =
let a' = f a b
in Cons a' (scan f a' bs)
unfold :: (s -> (a,s)) -> s -> Stream a
unfold f s0 =
let (a,s) = f s0
in Cons a (unfold f s)
(!!) :: Stream a -> Int -> a
Cons a _ !! 0 = a
Cons _ s !! n = s !! (n-1)
infixl 9 !!
```

#### Numbers

```
benchmarking stream/strict/integral of sinus for 1 second at 48 kHz
time 6.481 ms (6.398 ms .. 6.542 ms)
0.999 R² (0.998 R² .. 0.999 R²)
mean 6.311 ms (6.272 ms .. 6.353 ms)
std dev 118.5 μs (104.2 μs .. 132.0 μs)
```

This is definitely one of the fastest data types for this kind of scenario, but I can show you a data type that performs at nearly C level.

## Signal / Strict

Now I want to present the pearl of this benchmark, the `Signal`

data type. I have to admit, I’m not really sure why this data type beats the time of the previous data type, but numbers don’t lie.

```
data Signal s a b = Signal !((a,s) -> (b,s)) !s
scan :: (s -> a -> s) -> s -> Signal s a s
scan f = Signal (\(a,s) -> let s' = f s a in (s',s'))
unfold :: (s -> (b,s)) -> s -> Signal s a b
unfold f = Signal (\(_,s) -> f s)
(!!) :: Signal s () b -> Int -> b
Signal f s !! 0 = fst $ f ((),s)
Signal f s !! n = Signal f (snd (f ((),s))) !! (n-1)
infixl 9 !!
(>>) :: Signal s a b -> Signal t b c -> Signal (s,t) a c
Signal f s0 >> Signal g t0 = Signal go (s0,t0)
where
go (a,(s,t)) =
let (b,s') = f (a,s)
(c,t') = g (b,t)
in (c,(s',t'))
infixl 1 >>
integral :: Double -> Signal Double Double Double
integral rate = scan (\x i -> i + x * dt) 0
where
dt = 1 / fromIntegral rate
nthIntegral :: Double -> (Double -> Double) -> Int -> Double
nthIntegral rate fun n =
(unfold (\t -> (fun (t / fromIntegral rate),t+1)) 0
>> integral rate)
!! n
```

#### Numbers

```
benchmarking signal/strict/integral of sinus for 1 second at 48 kHz
time 2.767 ms (2.766 ms .. 2.768 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 2.766 ms (2.766 ms .. 2.767 ms)
std dev 1.310 μs (1.125 μs .. 1.596 μs)
```

This is as fast as you can get with Haskell, to my knowledge. To compare this numbers with the performance of C, I wrote the same benchmark in C, only without higher order functions and without composability.

## Baseline / C

```
#include <math.h>
double nthIntegral(double dt, int n)
{
double integral = 0;
int i;
for(i = 0; i < n; i++) {
double t = (double)i / dt;
double s = sin(t);
integral = integral + s * dt;
}
return integral;
}
```

#### Numbers

```
benchmarking c/baseline/integral of sinus for 1 second at 48 kHz
time 1.856 ms (1.854 ms .. 1.857 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 1.857 ms (1.857 ms .. 1.857 ms)
std dev 279.0 ns (233.2 ns .. 346.5 ns)
```

## Conclusion

Here is again an overview of all the benchmarking results.

```
benchmarking c/baseline/integral of sinus for 1 second at 48 kHz
time 1.856 ms (1.854 ms .. 1.857 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 1.857 ms (1.857 ms .. 1.857 ms)
std dev 279.0 ns (233.2 ns .. 346.5 ns)
benchmarking automaton/integral of sinus for 1 second at 48 kHz
time 1.430 s (1.401 s .. 1.492 s)
1.000 R² (1.000 R² .. 1.000 R²)
mean 1.432 s (1.424 s .. 1.438 s)
std dev 9.857 ms (0.0 s .. 10.97 ms)
variance introduced by outliers: 19% (moderately inflated)
benchmarking coroutine/lazy/integral of sinus for 1 second at 48 kHz
time 117.0 ms (116.5 ms .. 118.0 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 117.9 ms (117.6 ms .. 118.3 ms)
std dev 471.2 μs (340.4 μs .. 601.5 μs)
variance introduced by outliers: 11% (moderately inflated)
benchmarking coroutine/strict/integral of sinus for 1 second at 48 kHz
time 86.22 ms (83.67 ms .. 88.77 ms)
0.999 R² (0.998 R² .. 1.000 R²)
mean 84.65 ms (83.88 ms .. 85.85 ms)
std dev 1.432 ms (648.3 μs .. 2.493 ms)
benchmarking signal/lazy/integral of sinus for 1 second at 48 kHz
time 54.15 ms (53.39 ms .. 55.22 ms)
0.999 R² (0.999 R² .. 1.000 R²)
mean 54.20 ms (53.68 ms .. 54.65 ms)
std dev 839.1 μs (577.4 μs .. 1.166 ms)
benchmarking signal/strict/integral of sinus for 1 second at 48 kHz
time 2.767 ms (2.766 ms .. 2.768 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 2.766 ms (2.766 ms .. 2.767 ms)
std dev 1.310 μs (1.125 μs .. 1.596 μs)
benchmarking stream/lazy/integral of sinus for 1 second at 48 kHz
time 10.71 ms (10.63 ms .. 10.76 ms)
1.000 R² (0.999 R² .. 1.000 R²)
mean 10.61 ms (10.52 ms .. 10.67 ms)
std dev 186.2 μs (127.0 μs .. 260.3 μs)
benchmarking stream/strict/integral of sinus for 1 second at 48 kHz
time 6.481 ms (6.398 ms .. 6.542 ms)
0.999 R² (0.998 R² .. 0.999 R²)
mean 6.311 ms (6.272 ms .. 6.353 ms)
std dev 118.5 μs (104.2 μs .. 132.0 μs)
```

My recommendation is, if you want to work on audio processing and digital synthesis in Haskell, use the strict `Stream`

or strict `Signal`

data type I presented. The later one even can reach the performance of C very closely.

If you want to execute the benchmarks yourself or play with the source code, go to the repository on github.