Category: Functional Programming

04/29/13

Permalink 08:43:35 pm, by truewill Email , 746 words, 8242 views   English (US)
Categories: Books, Conventions, Design, Functional Programming, goto2013

GOTO notes: Implementing Riak in Erlang

Benefits and Challenges of Implementing Riak in Erlang
Steve Vinoski - http://basho.com/

Erlang
  Requirements
    Large number of concurrent activities
    Distributed systems
    Continuous operation for years
    Live updates and maintenance
    Hardware/software fault-tolerant
Concurrency - common theme
  Erlang processes are much more lightweight than OS threads
Reliability - what it gives you
  Isolation - processes communicate only by message passing
  Distribution - works across nodes
  Linking/supervision/monitoring - one process takes action when another fails
Small language; few elements; functional - relatively easy to learn
  Variables are immutable; no globals
  Flow control via pattern matching; recursion
Concurrency primitives
  Processes, not mutexes, etc.
  Selective receive lets you receive specific messages from anywhere in the message queue (even if other types are ahead of them)
OTP framework - everybody uses this

Riak
  CAP theorem
  A distributed highly available eventually consistent highly scalable open source key-value database written primarily in Erlang.
  Modeled after Amazon Dynamo. See Andy Gross’s 5 years later talk.
  Also provides MapReduce, secondary indexes, full-text search
  Built for operational ease - it just runs
  Multiple clients - .NET, Java, Node.js, etc.
  Consistent hashing - no sharding
  Stores replicas - N/R/W values - adjustable
  (The ring is very similar to the Security Now discussion of Tor.)
  Vector clock - how it determines stale value - number of operations actors performed and a timestamp
  All the nodes in a cluster are peers - no masters or slaves
  Nodes exchange their understanding of ring state via gossip protocol
    Riak uses the Erlang mesh for this
  Can simulate multi-node installment on a single machine (nice for development)
  At about 150 nodes, the cluster doesn’t scale well.

Control vs. Data
  Distributed Erlang is good for control plane, not so good for data plane
  Sending large data can block
  Use TCP, UDP, etc. directly for data plane traffic
  Don’t mix control plane and data plane traffic
    Riak still does this in a few places, unfortunately (they’re going to fix this)

Hinted Handoff
  Fallback vnode holds data for unavailable primary vnode
  Hands it off once primary becomes available

Read Repair
  Vnode with stale data is repaired via asynchronous update
  Eventual consistency
  Active anti-entropy (AAE) - can actively repair stale value before it’s read

Monitoring is great for cleaning up after aborted operations.

Pattern-matching is an elegant way to parse binary data.

gen_fsm - one of the OTP library behaviors (finite state machine)

In Erlang, everybody uses these behaviors; makes for more readable code.

Let It Crash - Joe Armstrong’s doctoral thesis (he created Erlang)

Business logic goes in Workers; Supervisors are very simple and just start and watch Workers. Little can go wrong with Supervisors.

Erlang/OTP System Facilities
  Get status of OTP process
  Trace function calls, messages
    Put a trace on a process - VERY powerful
    Seldom need a debugger
    Be careful - you can take a system down by improper tracing

Linking with C/C++
  NIF - Native Implemented Function
  Can ref count binaries
  Portable interface to OS
  Can crash the whole VM with a bad NIF
  NIF calls execute within a VM scheduler thread - can block the thread
  NIFs should only block for a millisecond or less
  Put long-running activities in their own threads.

Eunit - unit testing
QuickCheck - create model of SUT, and it runs randomly-generated tests against it (like PEX?) - shrinks the test case for easier debugging - awesome

Watch your memory.

Hot code loading really works.

Understand the Erlang VM.

A DSL for distributed systems.

A Little Riak Book, Riak Handbook

Elixir - Ruby-like language on Erlang VM - also some Lisp-style languages

O’Reilly Erlang book is also very good. (Francesco Cesarini, Simon Thompson)

(And he gave me a free book for asking a question! - Bill S.)

Permalink 08:30:20 pm, by truewill Email , 574 words, 8382 views   English (US)
Categories: Conventions, Functional Programming, goto2013

GOTO notes: Evolving Java

Evolving Java
Brian Goetz - Java Language Architect, Oracle

Compatibility - avoiding breaking existing programs - takes up a lot of his time.
Serialization - big regret.
What Would James Do?

Java SE 8 - modernizing Java - Summer 2013

  • Lambda Expressions
  • Interface Evolution
  • Bulk Collection Operations

Alonzo Church - 1930’s Lambda Calculus

Language features are only enablers.
Better libraries are what gets things done.

In 1995, few mainstream languages supported closures.
Today, Java is just about the last holdout. (Even C++ added them recently.)

In 1995, everything was sequential. (Single processor.) For loops, iterators, etc.
In today’s world, that’s an impediment to parallelizing code.

Mutability was the default.
Today, that’s the wrong default.

External Iteration
There’s a lot of accidental complexity in a for loop that’s mutating items in a collection. We want to separate the “what” from the “how".

Internal Iteration
Don’t ask me, just do it.
The library can use parallelism, out-of-order, laziness.
(This is an argument for ForEach() as a method. - Bill S.)

Lambda Expressions

  • Anonymous method
  • Compiler can often infer parameter types from context
  • Method references are supported
  • Treat code as data

What is the type of a lambda?

  • C# has delegates, etc.
  • Doing it in the VM would have been disruptive.
  • Historically used single-method interfaces to model functions.

Functional Interfaces

  • The type of a lambda expression is conditional on what we assign it to.
    Predicate<String> isEmpty = s -> s.isEmpty();
    (Don’t have to say that s is a String.)
    Runnable r = () -> { System.Out.printLn("Boo!"); };
  • The obvious answer (just add function types) was wrong - would have divided the libraries into old & new libraries (like generics in C# - Bill S.)

Problem - Interface Evolution
If we had lambdas when we designed Collection libraries, our libraries would look nothing like they do.
New feature: default methods (similar to abstract)
  You can add a virtual interface method with a default implementation.
  If an existing implementation doesn’t have the new method, it still works (using the default implementation).
  Multiple inheritance of behavior, not state.
  This is not full-blown traits (like Scala).
  Not designed for monkey-patching.

It’s All About the Libraries
Best way to evolve the platform - cheaper, more scalable, etc.

Lambdas Enable Better APIs
Boundary between client and library becomes more permeable
The library owns the “how", the client owns the “what”
Key effect on APIs is: more composability
  You want to build behavior by composition.

Sorting today: The code reads nothing like the problem statement.
You want the code to look as much as possible like the requirements.

Bulk operations on Collections
Exposing aggregate operations on collections.
(Very LINQ-y.)
You have to ask for parallelism, but you don’t have to ask very loudly.

Guy Steele’s talk on How to Think about Parallel Programming: Not!
(Highly recommended by presenter.)

Streams
New abstraction
Stream of values - not a data structure
Pipelined
Lazy
(like IEnumerable in C# - Bill S.)

Parallelism
Let the experts write the libraries, and let the libraries handle the hard problems.
JDK7 added Fork/Join - lots of boilerplate code
parallelStream() lets you do map-reduce easily.

Aggregation
sum() is a special case of what functional languages tend to call fold
Reductions work really nicely in parallel.
Composable.

Gentle push towards a more functional style of programming.
People will have to unlearn some things. Want pure methods, immutability, etc.

Tuples coming!?

Permalink 07:40:59 pm, by truewill Email , 490 words, 179 views   English (US)
Categories: C#, .NET, Conventions, Functional Programming, goto2013

GOTO notes: Democratizing Machine Learning

Democratizing Machine Learning With C# - LINQ goes ML
Erik Meijer

(There are almost certainly transcription errors here; these are entirely my fault. - Bill S.)

Problem: Should I play golf or not?
Streaming, real-time weather information
Want to find Func<Weather, Play>
Once I have this, I can create a real-time LINQ query using IObservable<Weather> as an input.

But I don’t know how to write that function.

What I have is a history of the weather and my calendar history of golf-playing.
I can create a Dictionary<Weather, Play> of examples (test cases).

ML is nothing more than a computer doing test-driven development.

A dictionary is a representation of a finite function.
You want a function that works for all arguments.

People have data and want to write a query to extract data from it, typically.
But how do I group the data? How do I order it? How do I filter it?
They could circle the data points on a whiteboard, but may not be able to write the query.

From example data, you’re generating a function.

If you thought the impedance mismatch between (relational) queries and programs was large, then the mismatch between ML and programming is astronomical.

A query has data and a function, and gives you an answer.
ML has data and answers, and gives you a function.

BigML - super-easy way to do ML (commercial - ka-ching!)
Supports Actional Model Download - gives you the function you’re looking for in your choice of languages.

Writing a decision tree algorithm isn’t that hard.
If you write your algorithms using queries, then you can parallelize them.

Bayes
Mathematicians are extremely sloppy.
Bayes is just monads.
When they talk about vectors, they really mean classes.
Play -> Distribution(Outlook) is the probability monad.
P(Weather|Play) is the way mathematicians represent this.

A monad is a type constructor with some functions.

Probability Monad
Type constructor:
Distribution<T> = List<Pair<T, double>>
Unit:
Return :: T -> Distribution<T>
Return t = [(t,1.0)]
Bind:
Bind :: Dist<S> x (S -> Dist<T>) –> Dist<T>
Bind ds f = [ (t, ps*pt) | (s,ps) <- ds, (t,pt) <- f(s) ]

The trick of functional programming is let the types do the work for you.
I do what the types tell me.

From our data we can compute P(Outlook|Play), P(Temperature|Play), etc.
But we want to know P(Play|Weather).

Bayes’ Theorem gives us a way to invert monadic functions.
It holds for any monad.
There’s a type error in it, though.

Given:
a :: A, f:: A -> B
Want:
g :: A -> AxB
g = a -> (a, f(a))

Using n for intersection, we can type-correct Bayes:
P(BnA|A) = P(B) * P(BnA| B)/P(A)

Permalink 07:32:27 pm, by truewill Email , 824 words, 131 views   English (US)
Categories: C#, IoC, .NET, Conventions, SQL, Web, Functional Programming, goto2013

GOTO notes: Open Systems

Keynote: Open Systems - Actors and Cloud
Erik Meijer

(His new company is Applied Duality - watch for great things!)

http://thisisindexed.com/
http://en.wikipedia.org/wiki/Henk_Barendregt
Mike Lee’s “One Minute” keynote

To hack, to help other developers, to solve real problems, to fix the world.

Duality - things have opposite aspects, but you should look at them as one whole. They’re not good or bad.

Nothing’s more different from SQL than Rx.

Question the relational database.

Does your data really look like this? It’s tied to the physical format.
Duality between Modelers (DB designers) & Developers. Plants vs. Zombies.
Modelers always talk about Nouns. Customers, Orders, Line Items.
There’s no reuse in SQL. There’s no difference between a table and a type.
There’s no abstraction.
It’s statically typed. It’s baked in to the layout of your files.
In order to make your database run fast, you have to have intimate knowledge of your data storage (to do joins, etc.).
It’s not really declarative, because you have to know what keys to join on. It doesn’t allow nested results, so you have to have the GROUP BY column in the SELECT. WITH RECURSIVE is another example - you have to explicitly write it out.
Declarative is relative - C is declarative to an assembly programmer.
Optimization - query plan - consultants tweak the SQL to get the plan you wanted - but why can’t you tweak the query plan directly? It’s just function pipeline.
ACID is also a lie - you set your transaction isolation level to get performance. Why so many hints?

The closed world assumption is the presumption that what is not currently known to be true is false.

The fallacies of Distributed Computing - well known

The fallacies of Declarative Computing
1. Exceptions do not exist.
2. Statistics are precise.
3. Memory is infinite.
4. There are no side-effects.
5. Schema doesn’t change.
6. There is one developer
7. Compilation time is free.
8. The language is homogeneous.

Once you go outside of SQL, the optimizer doesn’t know anything about this.

It’s really a B-tree. Just give me that and I’ll be happy! Leaky abstractions are a good thing! Then you know the performance characteristics. We don’t need all these layers of crust. Sometimes you need to go under the hood.

JavaScript people should not use futures, they should use comonads.
Task<T> is a co-monad in C#. See ContinueWith(Func<Task<T>, TResult>).
Avoids the need for multiple futures (one for failure case, etc.)

Comonads are gumball machines.
.Result - remove a gumball
Once out, never in. (It’s not hygenic.)
.Select (s/b map) - turns a gumball machine into a beer machine WITHOUT taking the gumballs out of the machine!
Comonad - .Franchise() - can create a machine of gumball machines
Can compose Franchise with Select using ContinueWith to create a beer machine franchise.

ContinueWith is the duality of SelectMany (aka flatmap).

Developers like Verbs.
We want to perform operations on things, while hiding how they’re accomplished.
We want to swap out implementations of interfaces. You can’t do that in SQL.
ConcurrentDictionary<K,V> implements the same interfaces as Dictionary<K,V>. If single-core code was written against the original interface, it would still work with multiple cores.
A dictionary is nothing more than a database.

Dependency Injection is a work-around for a deficiency in our programming languages.

Moving to the cloud - collections as a service (data structures that live in the cloud) - access with async/await (dream)

Developers don’t want REST. They want to program against objects. The first thing you do is write a wrapper. With an API, you can change the underlying implementation; they’re more abstract.

Actor Framework for Windows (JavaScript) - open source
Similar to database transactions and recovery - intercept all the mutations so you can replay the log (I think this is ActorFx, AKA Ax)
An Actor is something where I send it updates and out come a stream of changes. In Rx we call that a Subject. (The streams are IObservable.)
A UI is also a Subject. As is a web browser, as is a web server.
Dream: Compose communicating stream processors to build applications. Like Erlang, only language-independent. Rx everywhere.
Like Yahoo Pipes but non-visual (developers like code).
Trying to move Rx to an Apache project.
Need to wrap existing services into these Subjects so we can have a consistent interface.

Reactive Message Queue (Qx)
Decouple producer & consumer in space & time.
Either can be offline.
We don’t care about type systems.

Comonads are used when you have an asynchronous communication that gives you one value; Rx is used when you … that gives you multiple values.

Pat Helland’s article on Condos & Clouds

You Are The Subject.

Get rid of all the noise and just give me essence.
One of the great things about the NoSQL movement is that developers are now empowered.

04/29/12

Permalink 01:22:33 pm, by truewill Email , 214 words, 11454 views   English (US)
Categories: C#, Books, WTF, Design, Functional Programming

Ceremony

We’re still working through Seven Languages in Seven Weeks in our lunch-and-learn book club at work. In the Cedar Rapids .Net Users Group book club, we’re almost finished with Real-World Functional Programming with examples in F# and C#.

And I’m learning that I hate ceremony.

  • When writing Scala, I have to enclose blocks in curly braces. I don’t in F# or Ruby. Why?
  • For the Nth time, I forget to end an Erlang function with a dot. I get an obscure compiler error.

In my day job, I write mostly C#. I’m used to that - typing curly braces, ending lines with semicolons, declaring types - it’s automatic. A good IDE helps a lot.

More and more, I wonder why the compiler doesn’t do more of this. F# is also a strongly-typed .NET language, and it can infer most types.

I’m starting to appreciate the simplicity of the F#/Ruby way of doing things.

As a side note, why are Apple and Microsoft bringing back the Bad Old Days of memory management? I have no desire to code in Objective-C or C++. I don’t want to use COM. Even the Linux folks seem to have a love affair with C/C++. Give me memory management. Give me modern languages.

And give me less ceremony.

:: Next Page >>

Development Central

Development Central is the blog of Bill Sorensen, a professional software developer. Much of this will relate to C#, .NET, and OOP in general.

Disclaimer
These postings are provided "AS IS" with no warranties and confer no rights.

| Next >

Search

Categories

Linkblog

b2evolution

contributors

XML Feeds

What is RSS?

Who's Online?

  • Guest Users: 17

powered by b2evolution free blog software