Wednesday, May 17, 2006

SML hacking tip: Turn off polyEqual warnings

Note: Narrowly targeted Google-food. Skip if you do not program in Standard ML.

Recent versions of Standard ML of New Jersey (SML/NJ) print a message "Warning: calling polyEqual" when you write code that uses polymorphic equality. Here's how to turn it off:

sml -Ccontrol.poly-eq-warn=false

This works in CM mode, as in:

sml -Ccontrol.poly-eq-warn=false -m sources.cm

I am posting this here as Google-food because I just spent an hour Googling and grepping around in the SML/NJ sources trying to figure this out.

If you're using SML/NJ interactively, then you can also type

Control.polyEqWarn := false

into the read-eval-print loop. However, this doesn't work when you're trying to invoke the SML Compilation Manager in "make" mode (-m), because Control is not present in the default linkage environment. (And the SML/NJ documentation does not specify how to add it; or, at least, I haven't figured it out yet.)

More generally, all control flags (all bool refs in Control) can be toggled at the command line. This is documented in the command-line section of the SML Compilation Manager manual. There, we learn that -C can be used to set control parameters. You can get a listing of all control parameters using -S, as follows:

sml -S

Finally, I just want to remark in passing that unlike, say, "match redundant" or "match nonexhaustive", use of polymorphic equality is purely a performance problem, not a probable logic error. It's therefore highly questionable design to enable a warning message for polymorphic equality by default. The default setting should have been off, but available as a profiling/debugging option.
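
For readers who have not met the message yet, here is a minimal sketch of the kind of code that triggers it. The function names `occurrences` and `occurrencesInt` are made up for illustration. The first compares values of an equality type variable, so the compiler has to fall back on polymorphic equality; the second pins the element type to int, so an ordinary integer comparison is enough.

```sml
(* Inferred type: ''a * ''a list -> int.
   The comparison x = y is at an unknown equality type, which is the
   situation SML/NJ reports as "calling polyEqual". *)
fun occurrences (x, ys) =
    List.length (List.filter (fn y => x = y) ys)

(* Same logic, but the annotation fixes the type to int, so the
   comparison is a plain integer equality test. *)
fun occurrencesInt (x : int, ys) =
    List.length (List.filter (fn y => x = y) ys)
```

Both versions compute the same result; the difference is only in how the equality test gets implemented.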

6 comments:

  1. Thanks for the Google-food! You just saved me significant time.

  2. "use of polymorphic equality is purely a performance problem, not a probable logic error"

    Interesting. Can you explain this a little bit more? Why does use of polymorphic equality slow things down?

  3. Short answer:

    Unlike most other ML operations, polymorphic equality requires consultation of the runtime type of an object. So, you basically have to look up the complete type, and then invoke a general heap-walking function to compare the two object structures, consulting the type of every subcomponent as you recur.

    Long answer:

    ML's carefully designed so that most operations can be applied either to anything (i.e., they are universally polymorphic) or to exactly one type (i.e., they are monomorphic). Given a sound type system, neither universally polymorphic nor monomorphic operations require run-time type information. Therefore, ML doesn't need to attach a type header word to every object in memory.

    (Note that this contrasts with the standard implementation strategy for most object-oriented or dynamically typed languages. In an object-oriented language, you need to look up an object's type at runtime to do dynamic dispatch. In a dynamically typed language, you need to look up the type to do dynamic type checking. In both cases, typical implementations attach a header word to every object.)

    To use polymorphic equality, you have to get a complete type descriptor somehow to the site of the equality comparison. I don't remember how SML/NJ does this, but the obvious choices are (1) add type headers to every object or (2) when calling a function over eqtypes, pass a descriptor for each eqtype variable on the stack. (1) is pretty gross, so I assume real ML implementations do (2) (which, not coincidentally, resembles a really stunted version of how Haskell implements type classes).

    From the programmer's perspective, the alternative to using an eqtype ''a is to use a plain polymorphic type 'a and pass an explicit ('a * 'a) -> bool function. This may seem no better than ''a --- either way you're still passing something on the stack, and invoking a function that may need to inspect a large heap structure. However, the programmer may be able to stash the function somewhere so that it doesn't have to be passed on every call. Also, and probably more importantly, the programmer-provided equality function knows the fully instantiated type of the objects being compared, and was presumably compiled monomorphically, so it doesn't need to look up the types of subcomponents as it recurs down the object structures.

    In principle, a fancy ML compiler could neutralize the overhead of polymorphic equality through aggressive optimizations --- for example, it could generate specialized equality functions at polymorphic eqtype instantiation sites. I think SML/NJ doesn't bother.
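
    The two styles described in the long answer can be sketched as follows. This is only an illustration: `member`, `memberBy`, `point`, `pointEq`, and `pointMember` are hypothetical names, not anything from SML/NJ or the Basis Library.

    ```sml
    (* Style 1: an eqtype variable ''a, so x = y is polymorphic equality. *)
    fun member (x : ''a, ys : ''a list) : bool =
        List.exists (fn y => x = y) ys

    (* Style 2: a plain 'a plus an explicitly passed equality function. *)
    fun memberBy (eq : 'a * 'a -> bool) (x : 'a, ys : 'a list) : bool =
        List.exists (fn y => eq (x, y)) ys

    (* A hand-written equality that knows the full type of its arguments. *)
    type point = {x : int, y : int}
    fun pointEq (p : point, q : point) : bool =
        #x p = #x q andalso #y p = #y q

    (* "Stashing" the equality function once via partial application,
       so callers do not have to pass it on every call. *)
    val pointMember = memberBy pointEq

    val found = pointMember ({x = 1, y = 2}, [{x = 0, y = 0}, {x = 1, y = 2}])
    ```

    Here `pointEq` only ever compares int fields, so nothing in it needs to inspect types at runtime; that is the advantage the explicit-function style is after.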

  4. Appreciate you posting this; ML idiosyncrasies can be obnoxious, and this post is a big help for beginners (it can be a scary message to get :).

  5. Many thanks for posting this! I'm just learning ML and wasn't sure if the polyEqual warning was critical.

  6. Thanks! And you're right about the warning status.
