Lost Password? No account yet? Register
Member Area

Software Installation Technologies

May 17th

Home arrow Community arrow External Blogs arrow ClickOnce

ClickOnce
Chris Smith's completely unique view
F#, more F#, and maybe other stuff.

  • F# in 20 Minutes ? Part II

    Now that we have covered the basics, in minutes 8 - 14 we will cover the foundational concepts and types in F#. By the end of Part II you should be able to read F# code reasonably well.

    Immutability

    You may have noticed that I?ve been using the term ?value? to refer to identifiers rather than ?variable?. The reason for this is that types in F# are immutable by default, meaning that once they are created they cannot be changed. This may seem like a severe limitation, but immutability actually prevents some classes of bugs. In addition, immutable data is inherently thread safe meaning you don?t need to worry about sync locks in order to make your code parallelizeable. I?ll go into asynchronous programming in Part III.

    If you do need to mutate some data however, F# has the mutable keyword. This creates a variable who?s value can be changed by using the left arrow operator (<-).

    > let mutable x = "the original value.";;
    val mutable x : string
    > printfn "x's value is '%s'" x;;
    x's value is 'the original value.'
    val it : unit = ()
    
    > x <- "the new one.";;
    val it : unit = ()
    > printfn "x's value is now '%s'" x;;
    x's value is now 'the new one.'
    val it : unit = ()
     

    Reference values (Microsoft.FSharp.Core.Ref<_>)

    Reference values are another way of being able to mutate data. But rather than having a mutable variable on the stack, a reference cell is a pointer to variable which you modify on the heap. In F# there are some restrictions on where you can use mutable values. (Such as not being able to use them in inner lambdas.) But ref objects can be safely passed around, since they are immutable record values. (Which have a mutable field which can be modified.)

    To use reference cells you use (:=) to assign a new value and (!) to dereference.

    > let refCell = ref 42;;
    val refCell : int ref
    
    > refCell := -1;;
    val it : unit = ()
    
    > !refCell;;
    val it : int = ?1
     

    Modules

    In Part I we just declared values and functions arbitrarily. You might have wondered ?where did they go?, since in C# everything must belong to a class. Though F# can declare the standard .NET classes you are familiar with, it also has the notion of a module; which is a collection of values, functions, and types. (Contrast this with a namespace, which can only contain types.)

    This is how we were able to access ?List.map?. In the F# library (FSharp.Core.dll) there is a module named ?List? and on that is a function named ?map?.

    Modules serve as a way of encapsulating code when you are rapid prototyping without needing to spend the time to design a strict object-oriented type hierarchy.  To declare your own module, you can use the module keyword. In the example we will associate a mutable variable with the module, which will serve as a global variable.

    module ProgramSettings =
        let version = "1.0.0.0"
        let debugMode = ref false
        
    module MyProgram =
        
        do printfn "Version %s" ProgramSettings.version
        
        // You can also open modules like you can a namespace, which
        // brings all their functions and values into scope.
        open ProgramSettings
        debugMode := true
    
    

     

    Tuples

    A tuple (pronounced 'two-pull') is an ordered collection of values treated like an atomic unit. Traditionally if you wanted to pass around a group of semi-related values you would need to create a struct or class, or perhaps rely on ?out? parameters. A tuple allows you to keep things organized by grouping related values together without introducing a new type.

    To define a tuple, simply enclose the group of values in parentheses separate the individual components by commas.

    > let tuple = (1, false, "text");;
    val tuple : int * bool * string
    
    > let getNumberInfo (x : int) = (x, x.ToString(), x * x);;
    val getNumberInfo : int -> int * string * int
    
    > getNumberInfo 42;;
    val it : int * string * int = (42, "42", 1764)

     

    Functions can even take tuples as arguments.

    > let printBlogInfo (owner, title, url) = printfn "%s's blog [%s] is online at '%s'" owner title url;; 
    val printBlogInfo : string * string * string -> unit
    > let myBlog = ("Chris", "Completely Unique View", "http://blogs.msdn.com/chrsmith");;
    val myBlog : string * string * string
    
    > printBlogInfo myBlog;;
    Chris's blog [Completely Unique View] is online at 'http://blogs.msdn.com/chrsmith'
    val it : unit = ()
     

    Function Currying

    A novel feature F# provides is the ability to provide a subset of a function?s parameters, and bind a new value to the partial application. This is known as function currying. For example, consider a function which takes three integers and returns their sum. We can curry that function, by fixing the first parameter to be 10, resulting in a function that only takes two integers and returns their sum plus 10.

    > let addThree x y z = x + y + z;;
    val addThree : int -> int -> int -> int
    
    > let addTwo x y = addThree 10 x y;;
    val addTwo : int -> int -> int
    
    > addTwo 1 1;;
    val it : int = 12
    

     

    Discriminated Unions

    Consider the following enumeration:

    enum CardSuit { Spade = 1, Club = 2, Heart = 3, Diamond = 4};

    Logically there can only be one possible suit for a card, but since an enum is just an integer behind the scenes you don?t actually know if the value is valid or not. For example, in C# you can write:

    CardSuit invalid1 = (CardSuit) 9000;
    CardSuit invalid2 = CardSuit.Club | CardSuit.Diamond;

    This will break any code that expects the enum to only have one of the four card values. Also, Consider the case where you need to extend an enum. (Example borrowed from Jomo?s blog.)

    enum Title { Mr, Mrs }

    The Title enum works great, but imagine that later you add a ?Ms? value. Now every single switch statement you had is a potential bug. You can try to fix up all your code, but you are bound to miss something.

    Enums provide a way to easily express some concepts but don?t provide enough compiler checks to prevent errors. Enter the Discriminated Union. A Discriminated Union in F# is a type which can only have one of a set of values, known as data tags. For example, consider a discriminated union for all Microsoft employees.

    type MicrosoftEmployee =
        | BillGates
        | SteveBalmer
        | Worker of string
        | Lead   of string * MicrosoftEmployee list

    If you have an instance of type MicrosoftEmployee, you know that it can only be one of {BillGates, SteveBalmer, Worker, or Lead}. Moreover, if it is a Worker then you know that there is an string associated with that data tag, probably corresponding to a name. We can create discriminated unions easily and use pattern matching - described later - to match their values.

    let myBoss = Lead("Yasir", [Worker("Chris"); Worker("Matteo"); Worker("Santosh")])
    
    let printGreeting (emp : MicrosoftEmployee) =
        match emp with
        | BillGates   -> printfn "Hello, Bill"
        | SteveBalmer -> printfn "Hello, Steve"
        | Worker(name) | Lead(name, _) 
                      -> printfn "Hello, %s" name

    Now what if we extend that Discriminated Union?

    type MicrosoftEmployee =
        | BillGates
        | SteveBalmer
        | Worker of string
        | Lead   of string * MicrosoftEmployee list
        | ChrisSmith

    ? now we get a compiler warning in our match.

    image

    The compiler detected that you didn?t match every tag of the discriminated union and issued a warning.  Checks like this go a long way towards preventing bugs.  To see a few examples of just how powerful discriminated unions are, check out this post.

     

    Pattern Matching

    Pattern matching is like a powerful switch statement, allowing you to branch control flow. You can do more than just comparing a value against a constant however, pattern matching allows you to also capture new values. Such as in our previous example, we bound the identifier ?name? when matching against discriminated union tags.

    let printGreeting (emp : MicrosoftEmployee) =
        match emp with
        | BillGates   -> printfn "Hello, Bill"
        | SteveBalmer -> printfn "Hello, Steve"
        | Worker(name) | Lead(name, _) 
                      -> printfn "Hello, %s" name

    Pattern matching can also match against the ?structure? of the data. So we can match against list elements joined together. (Remember that x :: y means that x is an single element of the list, and y is list of items after it. Also ?[]? represents the empty list.)

    let listLength alist =
        match alist with
        | [] -> 0
        | a :: [] -> 1
        | a :: b :: [] -> 2
        | a :: b :: c :: [] -> 3
        | _ -> failwith "List is too big!" 

    At the end of the pattern match we had a wildcard ?_?, which matches anything. So if variable ?alist? had more than three items, the last pattern clause would execute and throw an exception. Pattern matching also allows you to execute an arbitrary expression to determine if the pattern is matched. (If that expression evaluates to false, then the clause is not matched.)

    let isOdd x =
        match x with
        | _ when x % 2 = 0 -> false
        | _ when x % 2 = 1 -> true

    You can even pattern match using a dynamic type test:

    let getType (x : obj) =
        match x with
        | :? string    -> "x is a string"
        | :? int       -> "x is an int"
        | :? Exception -> "x is an exception"

     

    Records

    Records are a lightweight syntax for declaring a type with several public properties. One advantage of records is that by using the type inference system the compiler will figure out the type of the record by you simply setting its values. So for example, once we define the record type ?Address? we don?t need to annotate a type when declare an instance of it. The compiler knows that only one record type exists with fields Name, Address, and Zip and infers that to be the type you meant.

    type Address = { Name : string; Address : string; Zip : int}    
    let whiteHouse = {Name = "The White House"; Address = "1600 Pennsylvania Avenue"; Zip = 20500}

     

    Forward Pipe Operator

    I love this guy. The Forward pipe operator is simply defined as:

    let (|>) x f = f x

    And has a type signature:

    'a -> ('a -> 'b) -> 'b

    Which translates to: given a generic type 'a, and a function which takes an 'a and returns a 'b, then return the application of the function on the input.

    Rather than explaining this, let me give you an example of where it can be used:

    // Take a number, square it, then convert it to a string, then reverse that string
    let square x         = x * x
    let toStr (x : int)  = x.ToString()
    let rev   (x : string) = new String(Array.rev (x.ToCharArray()))
    
    // 512 -> 1024 -> "1024" -> "4201"
    let result = rev (toStr (square 512))

    The code is very straight forward, but notice just how unruly the syntax looks. All we want to do is take the result of one computation and pass that to the next computation. We could rewrite it by introducing a series of new variables:

    let step1 = square 512
    let step2 = toStr step1
    let step3 = rev step2
    let result = step3

    But now you need to keep all those temporary variables straight.  What the (|>) operator does is take a value, and 'forward it' to a function, essentially allowing you to specify the parameter of a function before the function call. This dramatically simplifies F# code by allowing you to pipe functions together, where the result of one is passed into the next. So to use the same example the code can be written clearly as:

    let result = 512 |> square |> toStr |> rev

     

    Sequences (System.Collections.Generic.IEnumerator<_>)

    A Sequence or ?seq? in F# is just another name for System.Collections.Generic.IEnumerator, but it has a special place in the F# world. Unlike lists or arrays, Sequences can specify infinite series. Only the current value is stored in memory, and once the sequence computes the next item, the current value is forgotten. For example, the following code snippet generates an sequence of all integers.

    let allIntegers = Seq.init_infinite (fun i -> i)

     

    Collections (Seq, List, Array)

    When you need a collection of values in F# you have a least three good options ? Arrays, Lists, and Sequences. Each has their own advantages. Built into the F# library are a series of modules designed for each of these collections. You can use Visual Studio?s intellisense to explore the methods and what they do, but I?ll go into the most common ones here.

    iter. The ?iter? function iterates through each item in the collection. This is identical to a ?foreach? loop. The following prints every item in a list:

    List.iter (fun i -> printfn "Has element %d" i) [1 .. 10]

    map. Like I described in Part I, map transforms a collection based on a function. The following maps an array of integers to their string representations.

    Array.map (fun (i : int) -> i.ToString()) [| 1 .. 10 |]

    fold. Fold is used to take the collection and fold it into a single value, or reduce it. Like Iter and Map it takes a function which will be applied to each element in the collection, but unlike iter and map that function takes an additional ?accumulator? parameter. As fold is continually called it builds up this accumulator parameter based on the previous computation. So for example, given a sequence of integers a valid fold operation would be to sum each item in the series. (And reduce the series to a single value.) Our accumulator would be the sum of all the elements we have seen before.

    Seq.fold (fun acc i -> i + acc) 0 { 1 .. 10 }

    Only Seq has a fold method, List and Array have a fold_left and fold_right. The difference is the order in which items are evaluated. fold_left goes left to right while fold_right goes right to left.

     

    Option values (Microsoft.FSharp.Core.Option<_>)

    With such a demphasis placed on mutability, it is difficult to find nulls in F# code since values are always initialized. However, there are times when a ?null? value means more than an uninitialized variable. Sometimes it means the absence of a value. (Option values are similar to nullable types in C#.)

    F# has an ?option type? to represent two states: ?Some? and ?None?. In the following record type Person, the middle initial field may or may not have a value.

    type Person = { First : string; MI : string option; Last : string }
    
    let billg    = {First = "Bill";  MI = Some("H"); Last = "Gates" }
    let chrsmith = {First = "Chris"; MI = None;      Last = "Smith" }
    

     

    Lazy values (Microsoft.FSharp.Core.Lazy<_>)

    Lazy initialization is a term used to describe something which is computed as needed (and not right when it is declared). F# has lazy values, such as the following example where ?x? is an integer, but as part of its evaluation prints ?Computed? to the screen.

    > let x = lazy (printfn "Computed."; 42);;
    val x : Lazy<int>
    
    > let listOfX = [x; x; x];;
    val listOfX : Lazy<int> list
    
    > x.Force();;
    Computed.
    val it : int = 42
    

    You see that the value of x was computed when we called ?Force? and the value 42 was returned. You can use lazy initialization to avoid computing values which aren?t needed or if they are computationally expensive. In addition they are useful when constructing recursive values. For example, consider a discriminated union which represents an circular linked list:

    type InfiniteList =
        | ListNode of int * InfiniteList
        
    let rec circularList = ListNode(1, circularList)

    The value ?circularList? has a reference to itself (representing an infinite loop of ListNodes with value 1). Without lazy initialization declaring this type of value would be impossible. But behind the scenes the compiler uses lazy initialization to make this work.

     

    By now you should have enough exposure to F# primitives to understand the growing number of F# blogs out there. Next, in Part III we will go over advanced topics ? things which F# can do which other .NET languages can?t. Stay tuned!



  • F# in 20 Minutes - Part I

    With the April Refresh hot off the presses, I figured it was time to write a concise, 20-minute introduction to F#. The goal isn't to teach F#, but rather give a quick overview of the language and some of the things you can do with it. I?ll divide the 20 minutes into three parts:

    • Part I - A slow introduction to F#, explaining your first program
    • Part II ? A brisk overview of the foundational types and concepts in F#
    • Part III ? A quick sampling into advanced topics

    So if you're interested in seeing what the fuss is all about, the read on.

    What is F# and why should I learn it?

    F# is a functional programming language built on .NET. Just like C# and VB.NET, F# can take advantage of core libraries such as Windows Presentation Foundation, Windows Communication Foundation, Visual Studio Tools for Office, etc. With F# you can even write XBox games using XNA.

    But just because you can write code in a new language doesn't mean you should. So why use F#? Because being a functional language, F# makes writing some classes of programs much easier than its imperative cousins like C#. Parallel Programming and Language-Oriented Programming are two such domains that can be expressed easily in F#.

    If you?ve ever written a .NET application and found yourself fighting against with the language to get your idea expressed, then perhaps F# is what you?ve been looking for.

    Getting Started

    To get started with F# download and install the latest release (v1.9.4.15) at:
    http://research.microsoft.com/research/downloads/Details/7ac148a7-149b-4056-aa06-1e6754efd36f/Details.aspx?0sr=d

    (In a day or two it will be on the main site http://research.microsoft.com/fsharp/release.aspx)

     

    This will install the F# Project System on top of Visual Studio 2005 and 2008. First, create a new F# Project.

    clip_image001

    Next, add a new F# Source File. By default it will add a lot of ?tutorial? code, delete all that and insert the following into the empty code editor:

    #light
    
    let square x = x * x
    
    let numbers = [1 .. 10]
    
    let squares = List.map square numbers
    
    printfn "N^2 = %A"squares
    
    open System
    Console.ReadKey(true)

    Press F5 to run your program, and you'll see the following:

    clip_image002

    Nothing terribly exciting yet. Let?s will break the code down the code line by line and then see if what?s really going on, but before we do that, let me introduce VFSI.

    F# Interactive for Visual Studio

    The F# Interactive Console (FSI) is what?s know as a ' REPL loop? for Read-Evaluate-Print-Loop. This means: taking a code snippet, compiling and executing it, then printing the results. With it you can rapidly prototype and test your programs. To enable FSI from within Visual Studio (which is called VFSI), open up the Add-in Manager on the Tools menu item.

    clip_image003

    And check "F# Interactive for Visual Studio". Next, highlight the first two lines of the program...

    clip_image004

    ... and then press ALT+Enter. (Actually, you'll need to press ALT+Enter twice if FSI isn't opened already.) You should see a tool window appear with the text:

    image

    What you just did was send the code snippet directly to the F# Interactive session, and the results were displayed to the output. The result was the definition of a function 'square' which takes a type 'int' and returns a type 'int'.

    Next try typing "List.map square [1 .. 2 .. 10];;" into the FSI window. The ';;' indicates to FSI to stop reading in program text and evaluate the result.

    > List.map square [1 .. 2 .. 10];;
    val it : int list = [1; 9; 25; 49; 81]
    

     

    Now we have the ability to easily explore the F# language via FSI, let's go into what that code actually means. I recommend however that you type code inside the Visual Studio code editor and use 'Highlight + ALT + Enter' to send the code to FSI. Otherwise, you will have to retype your code very time you have a typeo. (Which, at least for me, is very often.)

    Language Basics

    #light (OCaml compat)

    F# has its roots in a programming language called OCaml and has the ability to cross compile OCaml code, meaning that it can compile simple OCaml programs unmodified. This ability however, means that F# requires some unsavory syntax by default. #light (pronounced hash-light) is a compiler directive that simplifies the syntax of the language.

    I highly recommend that you keep #light on since most F# code snippets you find will either declare it, or assume that it has been declared.

    let square x = x * x (Type Inference)

    This defines a function called 'square' which squares a number x. Consider for a moment the equivalent C# code:

    public static int square(int x)
    {
        return x * x;
    }

    Whereas C# requires you to specify type information as well as what the function actually returns, the F# compiler simply figures it out for you. This is referred to as type inference.

    From the function signature F# knows that square takes a single parameter named 'x' and that the function would return 'x * x'. (That last thing evaluated in a function body is the ?return value?, so no need for a ?return? keyword.) Since many primitive types support (*) such as byte, uint64, double, etc. F# defaults to int, a signed 32-bit integer.

    Now consider the following code which provides a 'type annotation' for one of the parameters, that is telling the compiler the type to expect. Since x is stated to be of type 'string', and (+) is only defined for taking two strings, then the parameter y must also be a string. And the result of x + y is the concatenation of both strings.

    > let concat (x : string) y = x + y;;
    val concat : string -> string -> string
    
    > concat "Hello, " "World!";;
    val it : string = "Hello, World!"
    

    We'll cover more advanced topics in type inference later. But for now just enjoy the fact that the F# compiler has the smarts to figure out what you mean, and doesn't require any hand holding.

    let numbers = [1 .. 10] (F# Lists)

    The next line simply declares a list of numbers one through ten. If you had typed [|1 .. 10|] that would have created a .NET array of integers. But an F# list is an immutable linked list, which is the backbone of functional programming. Try typing these things into the FSI window (remember to add a ?;;? at the end of each line):

    // Define a list 
    let vowels = ['e'; 'i'; 'o'; 'u'] 
    
    // Attach item to front (cons)
    let cons = 'a' :: vowels 
    
    // Concat two lists
    let sometimes = vowels @ ['y']

    I?ll cover lists more in-depth in Part II.

    let squares = List.map square numbers (List.map and first-order functions)

    Now we have a list of integers ('numbers') and a function ('square'), we want to create a new list where each item is the result of calling our function. In other words, mapping our function to each item in the list.

    Fortunately, List.map does just that. Consider another example:

    > List.map (fun x -> x % 2 = 0) [1 .. 10];;
    val it : bool list
    = [false; true; false; true; false; true; false; true; false; true]
    

    The code (fun i -> i % 2 = 0) defines an anonymous function, called a lambda expression, that has a parameter x and the function returns the result of "x % 2 = 0", which is whether or not x is even.

    Now notice what we just did - we passed a function as a parameter to another function. You simply can't do that in C#. (At least not easily.) But in F# it allowed us to be very expressive and succinct in our code. Passing around functions as values is known as 'first order functions' and is a hallmark of functional programming.

    printfn "N^2 = %A" squares

    printf is a simple and type-safe way to print text to the console window. To get a better feel for how printf works consider this example which prints an integer, floating-point number, and a string:

    > printfn "%d * %f = %s" 5 0.75 ((5.0 * 0.75).ToString());;
    5 * 0.750000 = 3.75
    val it : unit = ()
    

    The %d, %f, and %s are holes for integers, floats, strings. %A may also be used to print anything else.

    Console.ReadKey(true) (.NET Interop)

    The last line in our program was simply a call to System.Console.ReadKey, in order to pause the program before it closed. Since F# is built on top of .Net you can call any .NET library from F# - from Regular Expressions to WinForms. The line ?open System? simply opened the namespace and brought its types into scope, similar to the using keyword of C#.

    Now that we have the absolute basics out of the way, we can move onto the more interesting foundational types and concepts of F# so stay tuned for Part II!



  • Sieve of Eratosthenes in F#

    If you don?t know what the Sieve of Eratosthenes is, don?t worry because a few days ago neither did I. But in short it is a very simple way to calculate prime numbers. Essentially you loop through all integers and keep a list of numbers which are known to be composite. If you encounter a number which isn?t in your list, then it is prime. At which point you add all multiples of that prime number to your list, since you know they are composite.

    There is definitely room for improvement in my implementation, it?s a good example of using F# Computation Expressions to generate a sequence of prime numbers.

    #light

    // Reference System.Core in the v3.5 framework for 'HashSet'
    #r @"C:\Windows\assembly\GAC_MSIL\System.Core\3.5.0.0__b77a5c561934e089\System.Core.dll"

    open System
    open System.Collections.Generic

    let primesUnderOneMillion =
        seq {
            // First prime
           
    yield 2

            let knownComposites = new HashSet<int>()
           
            // Loop through all odd numbers; evens can't be prime
           
    for i in 3 .. 2 .. int 1E6 do
               
               
    // Check if its in our list, if not, its prime
               
    let found = knownComposites.Contains(i)
                if not found then
                    yield
    i

                // Add all multiples of i to our sieve, starting
                // at i and irecementing by i.
               
    do for j in i .. i .. int 1E6 do
                      
    knownComposites.Add(j) |> ignore
        }
       
    printfn "The first 50 primes:"
    Seq.iter (fun p -> printf "\t%d" p) (Seq.take 50 primesUnderOneMillion)

    Console.ReadKey(true)



  • Windows Live Writer

    So I figured I?d check out Windows Live Writer and see what all the buzz is about.

    Update: With a plugin I can now copy from Visual Studio, huzzah!

     

    Con ? Doesn?t preserve syntax highlighting when copying from Visual Studio.
    Pro ? With This Plug-in you can paste code from Visual Studio

    #light
    
    // Simple F# program to exercise fsc.exe
    
    type Season = 
        | Spring 
        | Summer 
        | Fall 
        | Winter
        override this.ToString() =
            match this with
            | Spring -> "Spring"
            | Summer -> "Summer"
            | Fall   -> "Fall"
            | Winter -> "Winter"
    
    let genSeasonTuples monthRange season =
        monthRange |> List.map (fun monthInd -> (monthInd, season))
    
    let monthSeasonList =   (genSeasonTuples  [1 .. 3]  Spring) 
                          @ (genSeasonTuples  [4 .. 6]  Summer) 
                          @ (genSeasonTuples  [7 .. 10] Fall) 
                          @ (genSeasonTuples [11 .. 12] Winter)
    let monthSeasonMap = Map.FromList monthSeasonList
    
    open System
    
    let currentMonth = DateTime.Now.Month
    let currentSeason = monthSeasonMap.[currentMonth]
    
    printfn "Current season = %s" (currentSeason.ToString())
    Console.WriteLine("(press any key)")
    Console.ReadKey(true)

    Pro ? Inserting images is really easy

     



  • A Java to x86 compiler written in F#

    A little personal information about me, I?m currently working on a masters degree at the University of Washington. And a few weeks ago I completed taking a course in  Compiler Construction, where the course project was to build a simple Java compiler that generated x86 executables.

     

    While most students chose to write their compilers in Java or C#, my partner and I wrote ours in F#. Though I won?t be publishing the source code - since it would be too tempting for other students using Appel?s Modern Compiler Implementation in Java ? I would like to point out where F# makes the task of writing a compiler much easier than say, C#. F#?s superiority in this domain can largely be attributed to discriminated unions, as you will soon find out.

     

    Abstract Syntax Tree, Parsers, and Lexer

    The first phase of the compiler was perhaps the easiest to implement in F#. Rather than representing some complex object hierarchy and use polymorphism, the AST can be represented as a series of discriminated unions.

     

    Ast.fs snippet

    type Statement =

        | CompoundStatement of Statement list

        | If of Expression * Statement * Statement          // If ([expr]) [stmt1] else [stmt2]

        | While of Expression * Statement

        | PrintLine of Expression

        | DeclarationOrAssignment of VariableOrType * DeclarationOrAssignmentTail

     

    type Ast =

        | MiniJavaProgram of Ast * Ast list                 // MainClass, ClassDeclaration list

        | MainClass of string * Ast                         // Name, main method

        | MainMethod of string * Type * Statement           // Name of parameter, return type, Statement

        | Class of string * Ast list                        // Name, VarDeclaration or Method

        | DerivedClass of string * string * Ast list        // Name, BaseClass, VarDeclaration or Method

        | VariableDeclaration of string * Type              // Type, name

        | Method of string * Type * Ast list * Ast          // Name, (return) Type, (parameter) Types, MethodBody

        | MethodBody of Statement list * Expression         // Statements, (return) Expression

     

    Parser.fsy snippet

    // These are the rules of the grammar along with the F# code of the

    // actions executed as rules are reduced.  In this case the actions

    // produce data using F# data construction terms.

    start: MiniJavaProgram { $1 }

     

    MiniJavaProgram:

          | MainClass                   { MiniJavaProgram($1, []) }

          | MainClass ClassDeclarations { MiniJavaProgram($1, List.rev $2)}

     

    MainClass:

          | CLASS ID LCURLY MainMethod RCURLY       { MainClass($2, $4) }

     

    MainMethod:

          | PUBLIC STATIC VOID MAIN LPAREN STRING LBRACKET RBRACKET ID RPAREN LCURLY Statement RCURLY

                                                    { MainMethod($9, MainMethodType, $12) }  

    ClassDeclarations:

          | ClassDeclaration                        { [$1] }

          | ClassDeclarations ClassDeclaration      { $2 :: $1 }

     

    ClassDeclaration:

          | CLASS ID LCURLY RCURLY                                     { Class($2, []) }

          | CLASS ID EXTENDS ID LCURLY RCURLY                          { DerivedClass($2, $4, []) }

          | CLASS ID LCURLY ClassMakeupDeclarations RCURLY             { Class ($2, (List.rev $4)) }

          | CLASS ID EXTENDS ID LCURLY ClassMakeupDeclarations RCURLY  { DerivedClass ($2, $4, ($6))}

     

    Type Checking and Symbol Table Generation

    Once we were able to successfully lex and parse source code, we moved onto type checking and generating a symbol table. Like most compilers, we simply used the Visitor design pattern. Which if implemented in C# requires some funky notion of double dispatch. That is, the object being visited has a method called ?visit? or ?accept? which is given a parameter of type Visitor, then, the object being visited calls ?visit? on the visitor passing ?this? as parameter.

     

    The problem with this is that it adds a lot of lame boiler plate code. For example, from the Wikipedia article each ?part? needs to implement the ?accept? method, which in turn calls the ?visit? method on the visitor. Via method overloading, the right visitor method will be called.

    class Engine {

        public void accept(Visitor visitor) {

            visitor.visit(this);

        }

    }

     

    class Body {

        public void accept(Visitor visitor) {

            visitor.visit(this);

        }

    }

     

    In F# with discriminated unions we can write a single ?walker? for the discriminated union types, and keep all the visitation code in one place. (Rather than decorating each node.) This keeps the code much simpler.

     

    Visitor.fs snippet

    type Walker =

     

        // Walks the ast tree with the given visitor

        static member WalkTree (astRoot:Ast, visitor:Visitor) =

            let walker = new Walker(visitor)

            walker.WalkAst(astRoot)

       

        // ... snip ...

     

        // Methods which walk the AST node, calling our visitor methods as nessecary.

        member private this.WalkAst (astNode : Ast) =

            // Pre visit

            visitor.PreVisitAst astNode

           

            let walkAstList (astList : Ast list) = 

                astList |> List.iter (fun nast -> this.WalkAst nast)

           

            let walkStmtList (stmtList : Statement list) =

                stmtList |> List.iter (fun nstmt -> this.WalkStatement nstmt)

           

            // Recurse

            let _ = match astNode with

                    | MiniJavaProgram (ast, astList)    -> this.WalkAst ast; walkAstList astList

                    | MainClass (_, ast)                -> this.WalkAst ast

                    | MainMethod (_, _, stmt)           -> this.WalkStatement stmt

                   

                    | Class (_, astList)                ->  walkAstList astList

                    | DerivedClass (_, _, astList)      ->  walkAstList astList

                   

                    | VariableDeclaration (_, varType)  -> this.WalkType varType

                    | Method (_, retType, astList, ast) -> this.WalkType retType; walkAstList astList; this.WalkAst ast

                    | MethodBody (stmtList, expr)       -> this.WalkExpression expr; walkStmtList stmtList

           

            // Post visit

            visitor.PostVisitAst astNode

     

    Executable Creation

    Generating the assembly code is straightforward by walking your IR nodes and building up a list of ?Instructions?, represented by the following DU:

     

    type Instruction =

        | MovRR of Register * Register

        | MovRI of Register * int

        | MovRM of Register * Memory

        | MovMA of Memory * string

        | MovMR of Memory * Register

        | Comment of string
        | ...
        override this.ToString() =

            match this with

            | MovRR(dstReg, srcReg) -> "  mov " + dstReg.ToString() + ", " + srcReg.ToString()

            | MovRI(dstReg, i)      -> "  mov " + dstReg.ToString() + ", " + i.ToString()

            | MovRM(dstReg, m)      -> "  mov " + dstReg.ToString() + ", " + m.ToString()

            | MovMA(m, address)     -> "  mov " + m.ToString() + ", " + "OFFSET " + address

            | MovMR(m, srcReg)      -> "  mov " + m.ToString() + ", " + srcReg.ToString()

            | Comment note          -> "; " + note

     

    Once the list of instructions has been generated, we then needed to build executables. To link them you could use ml.exe and cl.exe, Microsoft?s C/C++ compiler and linker. But since requiring a second step is lame, we embedded the files into .RESX files and exported them at runtime. While the F# compiler supports this easily, we actually did this part in C# so that we could take advantage of Visual Studio?s Managed Resource Designer.

     

                /// <summary>

                /// Takes the given assembly file and compiles it using the hard-coded boot loader code.

                /// </summary>

                public static string LinkAsmFile(string asmFile)

                {

                      try

                      {

                            // First, dump ML and boot.obj to disk.

                            string tempPath = Path.GetTempPath();

                            string mlExeFile = Path.Combine(tempPath, "ml.exe"); ;

     

                            // Extract the binary resources embedded in this DLL and write to file.

                            WriteFile(MasmResources.MASMResources.ml, mlExeFile);

     

                            // Shell execute ml.exe on the target assembly file.

                            // This produces a .obj file.

                            ProcessStartInfo procStartInfo = new ProcessStartInfo(mlExeFile, "/c /Cx /coff /Zi \"" + asmFile + "\"");

                            procStartInfo.RedirectStandardOutput = true;

                            procStartInfo.UseShellExecute = false;

     

                            Process linkerProc = Process.Start(procStartInfo);

                            linkerProc.WaitForExit();

                            if (linkerProc.ExitCode != 0)

                                  return linkerProc.StandardOutput.ReadToEnd();

                            else

                                  return "";

                      }

                      catch (Exception ex)

                      {

                            return ex.Message;

                      }

                }

     

    Sounds good, but notice that to run our F# compiler you need F#MiniJava.exe (the program), MASMResources.dll (C/C++ linker), and FSharp.Core (F# libraries). That all sounds lame. Fortunately F# has two awesome command line flags that will essentially embed any required assembly references into your executable:

     

    --standalone:

          Statically link the F# library and all transitively referenced DLLs

          that depend on the F# library into the assembly being generated.

          Disables the generation of the F# interface and optimization

          resources and renames all types beginning with Microsoft.FSharp,

          making them private to the target assembly. There must be no

          duplicate type names across any statically linked

          assemblies, no code may use reflection on any of the

          types in those assemblies, and no code may export F# library types

          as part of an API.

    --static-link <string>:

          Statically the given assembly and all transitively referenced DLLs

          that depend on this assembly.  NOTE: use an assembly name e.g. mylib

          , not a DLL name, e.g. mylib.dll.

     

    (Note: many bad things can happen if you do this, from security vulnerabilities to subtle codegen errors. These flags may be removed in a future release of F# and use at your own risk.)

     

    Having taken an undergraduate course in compilers I found the course material to be challenging, but when it came to the project implementation F# made it a breeze.

     



  • Project Euler - Problem 34

    In our continuing series of how to solve