Thursday, August 18, 2011

Ending GSoC

As the firm pencils down date for Google Summer of Code is coming up Monday, I figure I'd better make sure everyone understands what I've done and how to use it.
  • GhcOptions Record
  • This record has fields for every GHC option used by cabal. It replaces the ad hoc lists of strings that were used previously. GhcOptions is also a Monoid instance, so one can mappend more specific options onto basic ones. To allow this instance, lots of fields are wrapped in All, Any, First, and Last constructors, which can be kind of a pain when filling the fields manually, but it's a small price to pay for composability.
  • Repl Function
  • This addition to the Cabal API will start an interpreter session. By default, it will load all files from the library component of a package. If the user passes specific files, the function will try to build the component that contains these files, failing if no such component can be found or if multiple components contain the files. If the user wants to load some specific component, this can be specified by the --component flag. Like other functions in the Cabal API, there are a number of hooks that the user can specify to change default interpretation behavior: preRepl, replHook, and postRepl. The names kind of speak for themselves.
  • GHCi Support
  • The repl function eventually calls an interpret function defined individually for each compiler. Currently, the only compiler with interpretation defined is GHC. GHC's interpret implementation also defines a series of macros that overload the standard :load and :reload interpreter commands to call cabal repl --enable-reload on the specified component before loading, so that whatever preprocessing is necessary gets done first.
  • Other Changes to the Cabal API
  • Because GHC's interpret function calls the cabal-install tool, I had to add cabal to the list of known programs (including ar, ghc, ghc-pkg, etc). Also, because interpretation does not necessarily require building every single package, the topologically sorted build order list used in building is no longer sufficient. I've added a compBuildMap field to LocalBuildInfo which maps components to their topologically sorted list of dependencies.
  • Cabal Repl
  • The cabal-install tool now has a repl command:
      Usage: cabal repl [FILENAME] [FLAGS]
            --enable-reload
        Enable file reloading from within an interpreter.
            --disable-reload
        Disable file reloading from within an interpreter.
            --component[=COMPONENT]
        The component of the cabal package to interpret
          Examples:
            cabal repl
          Only library component in the package
            cabal repl Examples/Foo.hs
          Specific file
            cabal repl Bar.hs --component=myexe
          File from specific component
            cabal repl --component=library
          Library contained in the package
        The usual flags apply as well, of course, and the --enable-reload option isn't really meant for anyone except cabal itself to use when reloading packages.
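
To make the compBuildMap idea mentioned above more concrete, here is a minimal sketch of a map from each component to its dependencies in build order. It is not the actual Cabal code: the names ComponentName, buildOrderFor, and compBuildMapSketch are made up for illustration, components are reduced to plain strings, and the dependency graph is assumed to be acyclic.

```haskell
import qualified Data.Map as Map
import Data.List (nub)

-- Hypothetical stand-in: a component is identified by its name only.
type ComponentName = String

-- Dependencies of one component, ordered so that every dependency
-- appears before anything that needs it. Assumes there are no cycles.
buildOrderFor :: Map.Map ComponentName [ComponentName] -> ComponentName -> [ComponentName]
buildOrderFor direct name =
  nub (concatMap visit (Map.findWithDefault [] name direct))
  where
    -- A dependency's own dependencies come first, then the dependency.
    visit dep = buildOrderFor direct dep ++ [dep]

-- Turn a map of direct dependencies into a map of full build orders.
compBuildMapSketch :: Map.Map ComponentName [ComponentName]
                   -> Map.Map ComponentName [ComponentName]
compBuildMapSketch direct =
  Map.mapWithKey (\name _ -> buildOrderFor direct name) direct
```

With a map like this, interpreting one component only requires building the entries in its own list rather than walking a single package-wide build order.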

      Oh yeah. Here's the code, and here is the haddock stuff.

      Tuesday, July 12, 2011

      Cabal and GHCi

      For my Summer of Code project, I'm giving cabal a repl command. At first, I thought a simple filename argument would be enough. More complex Cabal packages require a slightly more detailed input, however. Often, there's more to a Cabal package than a library. Sometimes there are tests. Sometimes an executable. And all these components can have different ways of building themselves. This makes interpretation a bit of a problem. If a file is part of multiple components, which method of building should be used? Both should work, certainly, but choosing a component arbitrarily leaves little control to the user. In cases like this, it seems that cabal repl must take a component as well. Let's give it a flag --component=COMPONENT, where COMPONENT is the name of some library or executable or test suite that's a component in the cabal package. Fair enough.
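
One way to picture the ambiguity described above is a small selection function. This is only a sketch under stated assumptions: the Component type and selectComponent are hypothetical names, a component is reduced to a name plus a list of files, and the case where no file is given at all is not treated specially.

```haskell
import Data.List (intercalate)

-- Hypothetical, simplified view of a package component.
data Component = Component
  { componentName  :: String
  , componentFiles :: [FilePath]
  } deriving (Eq, Show)

-- Pick the component to interpret. An explicit component name wins;
-- otherwise exactly one component must contain all the given files.
selectComponent :: Maybe String -> [FilePath] -> [Component] -> Either String Component
selectComponent (Just name) _ comps =
  case filter ((== name) . componentName) comps of
    (c : _) -> Right c
    []      -> Left ("no component named " ++ name)
selectComponent Nothing files comps =
  case filter ownsAll comps of
    [c] -> Right c
    []  -> Left "no component contains the given files"
    cs  -> Left ("ambiguous, pass --component with one of: "
                 ++ intercalate ", " (map componentName cs))
  where
    ownsAll c = all (`elem` componentFiles c) files
```

Returning an error for the ambiguous case, instead of picking a component arbitrarily, keeps the choice with the user.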

      But there's more. What if a user wants to test a component in its entirety, not just one particular file? Here they only need to specify the component. So we say cabal repl --component=COMPONENT. What happens? What's loaded into GHCi? Should nothing be loaded, requiring the user to :load every module they want to work with? Should everything be loaded, leading to potentially nasty name collisions? Should everything be loaded, but qualified, meaning that users will have to type hilariously long prefixes before all their functions? Should there be yet another flag for what course of action to take? Please, comment and tell me, because I have no idea.

      Saturday, June 25, 2011

      Dumbing down the GhcOptions record

      I've decided to change the approach I'm taking to my Summer of Code project allowing Cabal's code for building with GHC to accommodate interpretation. Previously I had planned to make the options record describing the arguments given to GHC very high level, essentially consisting of a mode option (i.e. executable, library, interpretation, abi-hash, etc.) and a stage option (C stage, Template Haskell stage, link stage, etc.). A rendering function would translate this pair of options to a list of command line flags.
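
As a rough picture of that earlier high-level design, a rendering function could look something like the sketch below. The mode names follow the ones used elsewhere on this blog; the Stage constructors and the render function itself are hypothetical, and the mapping from mode and stage to flags is illustrative only, not what Cabal actually passes to GHC.

```haskell
-- What kind of build is going on.
data Mode = LibMode | ExecMode | AbiHashMode | GhciMode
  deriving (Eq, Show)

-- Which part of the build is being rendered (constructor names are made up).
data Stage = CStage | TemplateHaskellStage | HaskellStage | LinkStage
  deriving (Eq, Show)

-- Translate a (mode, stage) pair into GHC command line flags.
-- The flags are real GHC flags, but this particular mapping is an assumption.
render :: Mode -> Stage -> [String]
render _           CStage    = ["-c"]                 -- compile C sources only
render GhciMode    _         = ["--interactive"]      -- start the interpreter
render AbiHashMode _         = ["--abi-hash"]         -- compute the ABI hash
render _           LinkStage = ["--make"]             -- build and link
render _           _         = ["--make", "-no-link"] -- build without linking
```

The weakness of this shape is visible even in the sketch: every flag is decided inside render, so nothing can be adjusted individually from the outside.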

      Now, the GhcOptions record has become a direct representation of the command line options; every flag used in building Cabal packages can be adjusted individually. Creating the options for a particular mode and stage consists of gluing together (through mappend) a long list of GhcOptions pieces. There are pieces for (among others) build options, C options and verbosity options, allowing much more fine-grained control of the options given to GHC.
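
The gluing can be sketched with a cut-down record. Everything here is an assumption made for illustration: GhcOptionsSketch and its three fields are hypothetical stand-ins, not the real GhcOptions fields. On a current GHC the combining operator lives in a Semigroup instance; in 2011 mappend was defined directly in Monoid.

```haskell
import Data.Monoid (Any (..), Last (..))

-- Hypothetical, much smaller stand-in for the real record.
data GhcOptionsSketch = GhcOptionsSketch
  { sketchInputFiles :: [FilePath]    -- lists accumulate
  , sketchOutputFile :: Last FilePath -- the right-most setting wins
  , sketchProfiling  :: Any           -- on if any piece turns it on
  } deriving (Eq, Show)

-- Combine two pieces field by field.
instance Semigroup GhcOptionsSketch where
  a <> b = GhcOptionsSketch
    { sketchInputFiles = sketchInputFiles a <> sketchInputFiles b
    , sketchOutputFile = sketchOutputFile a <> sketchOutputFile b
    , sketchProfiling  = sketchProfiling a <> sketchProfiling b
    }

-- The empty piece sets nothing.
instance Monoid GhcOptionsSketch where
  mempty = GhcOptionsSketch mempty mempty mempty

-- Three small pieces, each setting only what it cares about.
baseOpts, profOpts, exeOpts :: GhcOptionsSketch
baseOpts = mempty { sketchInputFiles = ["Main.hs"], sketchOutputFile = Last (Just "a.out") }
profOpts = mempty { sketchProfiling = Any True }
exeOpts  = mempty { sketchOutputFile = Last (Just "dist/build/myexe") }

-- Later pieces refine earlier ones.
combined :: GhcOptionsSketch
combined = mconcat [baseOpts, profOpts, exeOpts]
```

Here combined keeps the input file from baseOpts, has profiling switched on by profOpts, and takes its output file from exeOpts because Last prefers the right-most value.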

      Wednesday, June 15, 2011

      Reloading GHCi

      Starting GHCi with Cabal is fairly easy, as the interpreter takes most of the same arguments we've been feeding GHC. But reloading it is a little more difficult. Admittedly, there are a number of things we don't have to do when reloading:

      • All the options given to the interpreter initially seem to come along for the ride with each :reload command, so rendering GHC command line options isn't necessary.
      • Preparing the build directory and auto-generated files already took place when the interpreter was started, so that stuff isn't required for a reload.

      In fact, there are really only two things that need doing. First, there's preprocessing. This is a high level operation not specific to a compiler. Second, there are the C sources, a build step specific to GHC.

      I propose adding a flag (--reload, perhaps) to the repl command to handle these two operations: the high level Build.hs code would perform the preprocessing and the compiler specific GHC.hs code would build the C sources. Macros would be defined in GHCi to shadow the commands that reload code, calling cabal repl FILENAME --reload first.

      To define these macros, I think I'll need to modify GHCi to take a custom .ghci file. I can't just sneakily add to the user's own one, and piping in the commands through cat or something would look ugly. As it is, I think the best approach is to add an option for an extra .ghci file. Cabal would write its set of macros to shadow default commands to some place in the dist directory, telling GHCi about the file.

      Tuesday, June 14, 2011

      Almost a Quine

      To allow reloading from GHCi to trigger all the necessary Cabal preparation, I need to define macros to shadow the default commands (:r, :l, etc.). The basic idea for each of these macros is pretty simple:
      • Call whatever programs needed for preparing the package for building
      • Un-define the command so that the default behavior returns
      • Call the default command
      • Respecify the custom implementation
      • Redefine the command
      Each of these macros must thus define itself again, creating a macro within a macro. Naturally, as this behavior is shared by all the overloaded GHCi commands, I created a macro to abstract this pattern. That's a string defining a macro to make a macro to make a macro. While this kind of thing is fine (almost expected) in Lisp, it doesn't work quite so well in Haskell, especially as syntax is just represented as a string.
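
The steps listed above can be sketched as a function that produces the commands one such macro would hand back to GHCi. This is a simplified illustration, not the code from the project: shadowSteps and shadowBody are hypothetical names, and the last two steps are folded into a call to a hypothetical helper macro, here called :cabalShadow, which is assumed to re-install the shadowing macro. Only :!, :undef, and :def are standard GHCi commands.

```haskell
-- Commands a shadowing macro returns for GHCi to run, in order.
--   prepare: shell command that gets the package ready
--   command: name of the built-in command being shadowed, e.g. "reload"
--   args:    arguments the user typed after the command
shadowSteps :: String -> String -> String -> [String]
shadowSteps prepare command args =
  [ ":!" ++ prepare                   -- run the preparation program
  , ":undef " ++ command              -- remove the macro so the default returns
  , unwords (filter (not . null) [":" ++ command, args])
                                      -- call the default command
  , ":cabalShadow " ++ command        -- hypothetical helper: redefine the macro
  ]

-- The same steps as one string, one command per line, which is the
-- form a macro defined with :def returns from its String -> IO String body.
shadowBody :: String -> String -> String -> String
shadowBody prepare command args = unlines (shadowSteps prepare command args)
```

Pushing the redefinition into a helper macro is what avoids writing the macro's own definition inside itself, at the cost of one more level of string quoting in the helper.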

      Sunday, May 29, 2011

      Starting Google Summer of Code

      This summer I've been given the privilege of working on Cabal for the Google Summer of Code.

      The project, originally described here and here, is to facilitate the interpretation of Cabal packages through GHCi. More specifically, I'll be doing the following:
      • I'll create a GhcOptions record containing all the options needed to build a package. Two options are particularly important:
        • The ghcOptMode option determines what type of build is going on. This could be one of LibMode, ExecMode, AbiHashMode, or GhciMode. The build function performs whatever type of build is necessary based on this context.
        • The ghcOptStage option determines what part of the build process needs to be rendered into a ProgramInvocation. This could be the building of C sources, pre-building for Template Haskell, building Haskell sources, or linking. Based on the current stage, the render function returns a ProgramInvocation with the necessary set of options for GHC.
        This means that the GhcOptions record is used for two separate generic functions, build and render.
      • Creating these records isn't something users should have to do manually, so I'll also be making counterparts to the old buildExe, buildLib, and buildAbiHash functions that bundle the required information into a nice tidy GhcOptions record before sending it off to build. Additionally, there will be an interpret function for bundling the options required for interpretation. A compiler agnostic version of interpret, repl, will appear in the high level Build.hs file, although I don't plan to add support for other compilers.
      • Next I'll make the repl function accessible as a command, giving it hooks and a CommandUI spec and adding it to cabal-install's long list of possible commands.
      • Then there's all those fun commands in GHCi that require rebuilding (:reload, :load, etc). Fortunately, I can overload these with macros that perform whatever special cabal stuff needs to be done first and feed these on the sly to GHCi.