never executed always true always false
    1 
    2 
    3 {-# OPTIONS_GHC -Wno-incomplete-uni-patterns #-}
    4 
    5 {-
    6 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
    7 
    8 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
    9 -}
   10 
   11 module GHC.Core.Opt.Specialise ( specProgram, specUnfolding ) where
   12 
   13 import GHC.Prelude
   14 
   15 import GHC.Driver.Session
   16 import GHC.Driver.Ppr
   17 import GHC.Driver.Config
   18 import GHC.Driver.Config.Diagnostic
   19 import GHC.Driver.Env
   20 
   21 import GHC.Tc.Utils.TcType hiding( substTy )
   22 
   23 import GHC.Core.Type  hiding( substTy, extendTvSubstList )
   24 import GHC.Core.Multiplicity
   25 import GHC.Core.Predicate
   26 import GHC.Core.Coercion( Coercion )
   27 import GHC.Core.Opt.Monad
   28 import qualified GHC.Core.Subst as Core
   29 import GHC.Core.Unfold.Make
   30 import GHC.Core
   31 import GHC.Core.Make      ( mkLitRubbish )
   32 import GHC.Core.Unify     ( tcMatchTy )
   33 import GHC.Core.Rules
   34 import GHC.Core.Utils     ( exprIsTrivial, getIdFromTrivialExpr_maybe
   35                           , mkCast, exprType )
   36 import GHC.Core.FVs
   37 import GHC.Core.TyCo.Rep (TyCoBinder (..))
   38 import GHC.Core.Opt.Arity     ( collectBindersPushingCo
   39                               , etaExpandToJoinPointRule )
   40 
   41 import GHC.Builtin.Types  ( unboxedUnitTy )
   42 
   43 import GHC.Data.Maybe     ( mapMaybe, maybeToList, isJust )
   44 import GHC.Data.Bag
   45 import GHC.Data.FastString
   46 import GHC.Data.List.SetOps
   47 
   48 import GHC.Types.Basic
   49 import GHC.Types.Unique.Supply
   50 import GHC.Types.Unique.DFM
   51 import GHC.Types.Name
   52 import GHC.Types.Tickish
   53 import GHC.Types.Id.Make  ( voidArgId, voidPrimId )
   54 import GHC.Types.Var      ( isLocalVar )
   55 import GHC.Types.Var.Set
   56 import GHC.Types.Var.Env
   57 import GHC.Types.Id
   58 import GHC.Types.Error
   59 
   60 import GHC.Utils.Error ( mkMCDiagnostic )
   61 import GHC.Utils.Monad    ( foldlM )
   62 import GHC.Utils.Misc
   63 import GHC.Utils.Outputable
   64 import GHC.Utils.Panic
   65 import GHC.Utils.Trace
   66 
   67 import GHC.Unit.Module( Module )
   68 import GHC.Unit.Module.ModGuts
   69 import GHC.Unit.External
   70 
   71 {-
   72 ************************************************************************
   73 *                                                                      *
   74 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
   75 *                                                                      *
   76 ************************************************************************
   77 
   78 These notes describe how we implement specialisation to eliminate
   79 overloading.
   80 
   81 The specialisation pass works on Core
   82 syntax, complete with all the explicit dictionary application,
   83 abstraction and construction as added by the type checker.  The
   84 existing type checker remains largely as it is.
   85 
   86 One important thought: the {\em types} passed to an overloaded
   87 function, and the {\em dictionaries} passed are mutually redundant.
   88 If the same function is applied to the same type(s) then it is sure to
   89 be applied to the same dictionary(s)---or rather to the same {\em
   90 values}.  (The arguments might look different but they will evaluate
   91 to the same value.)
   92 
   93 Second important thought: we know that we can make progress by
   94 treating dictionary arguments as static and worth specialising on.  So
   95 we can do without binding-time analysis, and instead specialise on
   96 dictionary arguments and no others.
   97 
   98 The basic idea
   99 ~~~~~~~~~~~~~~
  100 Suppose we have
  101 
  102         let f = <f_rhs>
  103         in <body>
  104 
  105 and suppose f is overloaded.
  106 
  107 STEP 1: CALL-INSTANCE COLLECTION
  108 
  109 We traverse <body>, accumulating all applications of f to types and
  110 dictionaries.
  111 
  112 (Might there be partial applications, to just some of its types and
  113 dictionaries?  In principle yes, but in practice the type checker only
  114 builds applications of f to all its types and dictionaries, so partial
  115 applications could only arise as a result of transformation, and even
  116 then I think it's unlikely.  In any case, we simply don't accumulate such
  117 partial applications.)
  118 
  119 
  120 STEP 2: EQUIVALENCES
  121 
  122 So now we have a collection of calls to f:
  123         f t1 t2 d1 d2
  124         f t3 t4 d3 d4
  125         ...
  126 Notice that f may take several type arguments.  To avoid ambiguity, we
  127 say that f is called at type t1/t2 and t3/t4.
  128 
  129 We take equivalence classes using equality of the *types* (ignoring
  130 the dictionary args, which as mentioned previously are redundant).
  131 
  132 STEP 3: SPECIALISATION
  133 
  134 For each equivalence class, choose a representative (f t1 t2 d1 d2),
  135 and create a local instance of f, defined thus:
  136 
  137         f@t1/t2 = <f_rhs> t1 t2 d1 d2
  138 
  139 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
  140 of simplification will now result.  However we don't actually *do* that
  141 simplification.  Rather, we leave it for the simplifier to do.  If we
  142 *did* do it, though, we'd get more call instances from the specialised
  143 RHS.  We can work out what they are by instantiating the call-instance
  144 set from f's RHS with the types t1, t2.
  145 
  146 Add this new id to f's IdInfo, to record that f has a specialised version.
  147 
  148 Before doing any of this, check that f's IdInfo doesn't already
  149 tell us about an existing instance of f at the required type/s.
  150 (This might happen if specialisation was applied more than once, or
  151 it might arise from user SPECIALIZE pragmas.)
  152 
  153 Recursion
  154 ~~~~~~~~~
  155 Wait a minute!  What if f is recursive?  Then we can't just plug in
  156 its right-hand side, can we?
  157 
  158 But it's ok.  The type checker *always* creates non-recursive definitions
  159 for overloaded recursive functions.  For example:
  160 
  161         f x = f (x+x)           -- Yes I know its silly
  162 
  163 becomes
  164 
  165         f a (d::Num a) = let p = +.sel a d
  166                          in
  167                          letrec fl (y::a) = fl (p y y)
  168                          in
  169                          fl
  170 
  171 We still have recursion for non-overloaded functions which we
  172 specialise, but the recursive call should get specialised to the
  173 same recursive version.
  174 
  175 
  176 Polymorphism 1
  177 ~~~~~~~~~~~~~~
  178 
  179 All this is crystal clear when the function is applied to *constant
  180 types*; that is, types which have no type variables inside.  But what if
  181 it is applied to non-constant types?  Suppose we find a call of f at type
  182 t1/t2.  There are two possibilities:
  183 
  184 (a) The free type variables of t1, t2 are in scope at the definition point
  185 of f.  In this case there's no problem, we proceed just as before.  A common
  186 example is as follows.  Here's the Haskell:
  187 
  188         g y = let f x = x+x
  189               in f y + f y
  190 
  191 After typechecking we have
  192 
  193         g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
  194                                 in +.sel a d (f a d y) (f a d y)
  195 
  196 Notice that the call to f is at type type "a"; a non-constant type.
  197 Both calls to f are at the same type, so we can specialise to give:
  198 
  199         g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
  200                                 in +.sel a d (f@a y) (f@a y)
  201 
  202 
  203 (b) The other case is when the type variables in the instance types
  204 are *not* in scope at the definition point of f.  The example we are
  205 working with above is a good case.  There are two instances of (+.sel a d),
  206 but "a" is not in scope at the definition of +.sel.  Can we do anything?
  207 Yes, we can "common them up", a sort of limited common sub-expression deal.
  208 This would give:
  209 
  210         g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
  211                                     f@a (x::a) = +.sel@a x x
  212                                 in +.sel@a (f@a y) (f@a y)
  213 
  214 This can save work, and can't be spotted by the type checker, because
  215 the two instances of +.sel weren't originally at the same type.
  216 
  217 Further notes on (b)
  218 
  219 * There are quite a few variations here.  For example, the defn of
  220   +.sel could be floated outside the \y, to attempt to gain laziness.
  221   It certainly mustn't be floated outside the \d because the d has to
  222   be in scope too.
  223 
  224 * We don't want to inline f_rhs in this case, because
  225 that will duplicate code.  Just commoning up the call is the point.
  226 
  227 * Nothing gets added to +.sel's IdInfo.
  228 
  229 * Don't bother unless the equivalence class has more than one item!
  230 
  231 Not clear whether this is all worth it.  It is of course OK to
  232 simply discard call-instances when passing a big lambda.
  233 
  234 Polymorphism 2 -- Overloading
  235 ~~~~~~~~~~~~~~
  236 Consider a function whose most general type is
  237 
  238         f :: forall a b. Ord a => [a] -> b -> b
  239 
  240 There is really no point in making a version of g at Int/Int and another
  241 at Int/Bool, because it's only instantiating the type variable "a" which
  242 buys us any efficiency. Since g is completely polymorphic in b there
  243 ain't much point in making separate versions of g for the different
  244 b types.
  245 
  246 That suggests that we should identify which of g's type variables
  247 are constrained (like "a") and which are unconstrained (like "b").
  248 Then when taking equivalence classes in STEP 2, we ignore the type args
  249 corresponding to unconstrained type variable.  In STEP 3 we make
  250 polymorphic versions.  Thus:
  251 
  252         f@t1/ = /\b -> <f_rhs> t1 b d1 d2
  253 
  254 We do this.
  255 
  256 
  257 Dictionary floating
  258 ~~~~~~~~~~~~~~~~~~~
  259 Consider this
  260 
  261         f a (d::Num a) = let g = ...
  262                          in
  263                          ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
  264 
  265 Here, g is only called at one type, but the dictionary isn't in scope at the
  266 definition point for g.  Usually the type checker would build a
  267 definition for d1 which enclosed g, but the transformation system
  268 might have moved d1's defn inward.  Solution: float dictionary bindings
  269 outwards along with call instances.
  270 
  271 Consider
  272 
  273         f x = let g p q = p==q
  274                   h r s = (r+s, g r s)
  275               in
  276               h x x
  277 
  278 
  279 Before specialisation, leaving out type abstractions we have
  280 
  281         f df x = let g :: Eq a => a -> a -> Bool
  282                      g dg p q = == dg p q
  283                      h :: Num a => a -> a -> (a, Bool)
  284                      h dh r s = let deq = eqFromNum dh
  285                                 in (+ dh r s, g deq r s)
  286               in
  287               h df x x
  288 
  289 After specialising h we get a specialised version of h, like this:
  290 
  291                     h' r s = let deq = eqFromNum df
  292                              in (+ df r s, g deq r s)
  293 
  294 But we can't naively make an instance for g from this, because deq is not in scope
  295 at the defn of g.  Instead, we have to float out the (new) defn of deq
  296 to widen its scope.  Notice that this floating can't be done in advance -- it only
  297 shows up when specialisation is done.
  298 
  299 User SPECIALIZE pragmas
  300 ~~~~~~~~~~~~~~~~~~~~~~~
  301 Specialisation pragmas can be digested by the type checker, and implemented
  302 by adding extra definitions along with that of f, in the same way as before
  303 
  304         f@t1/t2 = <f_rhs> t1 t2 d1 d2
  305 
  306 Indeed the pragmas *have* to be dealt with by the type checker, because
  307 only it knows how to build the dictionaries d1 and d2!  For example
  308 
  309         g :: Ord a => [a] -> [a]
  310         {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
  311 
  312 Here, the specialised version of g is an application of g's rhs to the
  313 Ord dictionary for (Tree Int), which only the type checker can conjure
  314 up.  There might not even *be* one, if (Tree Int) is not an instance of
  315 Ord!  (All the other specialision has suitable dictionaries to hand
  316 from actual calls.)
  317 
  318 Problem.  The type checker doesn't have to hand a convenient <f_rhs>, because
  319 it is buried in a complex (as-yet-un-desugared) binding group.
  320 Maybe we should say
  321 
  322         f@t1/t2 = f* t1 t2 d1 d2
  323 
  324 where f* is the Id f with an IdInfo which says "inline me regardless!".
  325 Indeed all the specialisation could be done in this way.
  326 That in turn means that the simplifier has to be prepared to inline absolutely
  327 any in-scope let-bound thing.
  328 
  329 
  330 Again, the pragma should permit polymorphism in unconstrained variables:
  331 
  332         h :: Ord a => [a] -> b -> b
  333         {-# SPECIALIZE h :: [Int] -> b -> b #-}
  334 
  335 We *insist* that all overloaded type variables are specialised to ground types,
  336 (and hence there can be no context inside a SPECIALIZE pragma).
  337 We *permit* unconstrained type variables to be specialised to
  338         - a ground type
  339         - or left as a polymorphic type variable
  340 but nothing in between.  So
  341 
  342         {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
  343 
  344 is *illegal*.  (It can be handled, but it adds complication, and gains the
  345 programmer nothing.)
  346 
  347 
  348 SPECIALISING INSTANCE DECLARATIONS
  349 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  350 Consider
  351 
  352         instance Foo a => Foo [a] where
  353                 ...
  354         {-# SPECIALIZE instance Foo [Int] #-}
  355 
  356 The original instance decl creates a dictionary-function
  357 definition:
  358 
  359         dfun.Foo.List :: forall a. Foo a -> Foo [a]
  360 
  361 The SPECIALIZE pragma just makes a specialised copy, just as for
  362 ordinary function definitions:
  363 
  364         dfun.Foo.List@Int :: Foo [Int]
  365         dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
  366 
  367 The information about what instance of the dfun exist gets added to
  368 the dfun's IdInfo in the same way as a user-defined function too.
  369 
  370 
  371 Automatic instance decl specialisation?
  372 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  373 Can instance decls be specialised automatically?  It's tricky.
  374 We could collect call-instance information for each dfun, but
  375 then when we specialised their bodies we'd get new call-instances
  376 for ordinary functions; and when we specialised their bodies, we might get
  377 new call-instances of the dfuns, and so on.  This all arises because of
  378 the unrestricted mutual recursion between instance decls and value decls.
  379 
  380 Still, there's no actual problem; it just means that we may not do all
  381 the specialisation we could theoretically do.
  382 
  383 Furthermore, instance decls are usually exported and used non-locally,
  384 so we'll want to compile enough to get those specialisations done.
  385 
  386 Lastly, there's no such thing as a local instance decl, so we can
  387 survive solely by spitting out *usage* information, and then reading that
  388 back in as a pragma when next compiling the file.  So for now,
  389 we only specialise instance decls in response to pragmas.
  390 
  391 
  392 SPITTING OUT USAGE INFORMATION
  393 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  394 
  395 To spit out usage information we need to traverse the code collecting
  396 call-instance information for all imported (non-prelude?) functions
  397 and data types. Then we equivalence-class it and spit it out.
  398 
  399 This is done at the top-level when all the call instances which escape
  400 must be for imported functions and data types.
  401 
  402 *** Not currently done ***
  403 
  404 
  405 Partial specialisation by pragmas
  406 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  407 What about partial specialisation:
  408 
  409         k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
  410         {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
  411 
  412 or even
  413 
  414         {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
  415 
  416 Seems quite reasonable.  Similar things could be done with instance decls:
  417 
  418         instance (Foo a, Foo b) => Foo (a,b) where
  419                 ...
  420         {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
  421         {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
  422 
  423 Ho hum.  Things are complex enough without this.  I pass.
  424 
  425 
  426 Requirements for the simplifier
  427 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  428 The simplifier has to be able to take advantage of the specialisation.
  429 
  430 * When the simplifier finds an application of a polymorphic f, it looks in
  431 f's IdInfo in case there is a suitable instance to call instead.  This converts
  432 
  433         f t1 t2 d1 d2   ===>   f_t1_t2
  434 
  435 Note that the dictionaries get eaten up too!
  436 
  437 * Dictionary selection operations on constant dictionaries must be
  438   short-circuited:
  439 
  440         +.sel Int d     ===>  +Int
  441 
  442 The obvious way to do this is in the same way as other specialised
  443 calls: +.sel has inside it some IdInfo which tells that if it's applied
  444 to the type Int then it should eat a dictionary and transform to +Int.
  445 
  446 In short, dictionary selectors need IdInfo inside them for constant
  447 methods.
  448 
  449 * Exactly the same applies if a superclass dictionary is being
  450   extracted:
  451 
  452         Eq.sel Int d   ===>   dEqInt
  453 
  454 * Something similar applies to dictionary construction too.  Suppose
  455 dfun.Eq.List is the function taking a dictionary for (Eq a) to
  456 one for (Eq [a]).  Then we want
  457 
  458         dfun.Eq.List Int d      ===> dEq.List_Int
  459 
  460 Where does the Eq [Int] dictionary come from?  It is built in
  461 response to a SPECIALIZE pragma on the Eq [a] instance decl.
  462 
  463 In short, dfun Ids need IdInfo with a specialisation for each
  464 constant instance of their instance declaration.
  465 
  466 All this uses a single mechanism: the SpecEnv inside an Id
  467 
  468 
  469 What does the specialisation IdInfo look like?
  470 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  471 
  472 The SpecEnv of an Id maps a list of types (the template) to an expression
  473 
  474         [Type]  |->  Expr
  475 
  476 For example, if f has this RuleInfo:
  477 
  478         [Int, a]  ->  \d:Ord Int. f' a
  479 
  480 it means that we can replace the call
  481 
  482         f Int t  ===>  (\d. f' t)
  483 
  484 This chucks one dictionary away and proceeds with the
  485 specialised version of f, namely f'.
  486 
  487 
  488 What can't be done this way?
  489 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  490 There is no way, post-typechecker, to get a dictionary for (say)
  491 Eq a from a dictionary for Eq [a].  So if we find
  492 
  493         ==.sel [t] d
  494 
  495 we can't transform to
  496 
  497         eqList (==.sel t d')
  498 
  499 where
  500         eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
  501 
  502 Of course, we currently have no way to automatically derive
  503 eqList, nor to connect it to the Eq [a] instance decl, but you
  504 can imagine that it might somehow be possible.  Taking advantage
  505 of this is permanently ruled out.
  506 
  507 Still, this is no great hardship, because we intend to eliminate
  508 overloading altogether anyway!
  509 
  510 A note about non-tyvar dictionaries
  511 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  512 Some Ids have types like
  513 
  514         forall a,b,c. Eq a -> Ord [a] -> tau
  515 
  516 This seems curious at first, because we usually only have dictionary
  517 args whose types are of the form (C a) where a is a type variable.
  518 But this doesn't hold for the functions arising from instance decls,
  519 which sometimes get arguments with types of form (C (T a)) for some
  520 type constructor T.
  521 
  522 Should we specialise wrt this compound-type dictionary?  We used to say
  523 "no", saying:
  524         "This is a heuristic judgement, as indeed is the fact that we
  525         specialise wrt only dictionaries.  We choose *not* to specialise
  526         wrt compound dictionaries because at the moment the only place
  527         they show up is in instance decls, where they are simply plugged
  528         into a returned dictionary.  So nothing is gained by specialising
  529         wrt them."
  530 
  531 But it is simpler and more uniform to specialise wrt these dicts too;
  532 and in future GHC is likely to support full fledged type signatures
  533 like
  534         f :: Eq [(a,b)] => ...
  535 
  536 
  537 ************************************************************************
  538 *                                                                      *
  539 \subsubsection{The new specialiser}
  540 *                                                                      *
  541 ************************************************************************
  542 
  543 Our basic game plan is this.  For let(rec) bound function
  544         f :: (C a, D c) => (a,b,c,d) -> Bool
  545 
  546 * Find any specialised calls of f, (f ts ds), where
  547   ts are the type arguments t1 .. t4, and
  548   ds are the dictionary arguments d1 .. d2.
  549 
  550 * Add a new definition for f1 (say):
  551 
  552         f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
  553 
  554   Note that we abstract over the unconstrained type arguments.
  555 
  556 * Add the mapping
  557 
  558         [t1,b,t3,d]  |->  \d1 d2 -> f1 b d
  559 
  560   to the specialisations of f.  This will be used by the
  561   simplifier to replace calls
  562                 (f t1 t2 t3 t4) da db
  563   by
  564                 (\d1 d1 -> f1 t2 t4) da db
  565 
  566   All the stuff about how many dictionaries to discard, and what types
  567   to apply the specialised function to, are handled by the fact that the
  568   SpecEnv contains a template for the result of the specialisation.
  569 
  570 We don't build *partial* specialisations for f.  For example:
  571 
  572   f :: Eq a => a -> a -> Bool
  573   {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
  574 
  575 Here, little is gained by making a specialised copy of f.
  576 There's a distinct danger that the specialised version would
  577 first build a dictionary for (Eq b, Eq c), and then select the (==)
  578 method from it!  Even if it didn't, not a great deal is saved.
  579 
  580 We do, however, generate polymorphic, but not overloaded, specialisations:
  581 
  582   f :: Eq a => [a] -> b -> b -> b
  583   ... SPECIALISE f :: [Int] -> b -> b -> b ...
  584 
  585 Hence, the invariant is this:
  586 
  587         *** no specialised version is overloaded ***
  588 
  589 
  590 ************************************************************************
  591 *                                                                      *
  592 \subsubsection{The exported function}
  593 *                                                                      *
  594 ************************************************************************
  595 -}
  596 
  597 -- | Specialise calls to type-class overloaded functions occurring in a program.
  598 specProgram :: ModGuts -> CoreM ModGuts
  599 specProgram guts@(ModGuts { mg_module = this_mod
  600                           , mg_rules = local_rules
  601                           , mg_binds = binds })
  602   = do { dflags <- getDynFlags
  603 
  604               -- We need to start with a Subst that knows all the things
  605               -- that are in scope, so that the substitution engine doesn't
  606               -- accidentally re-use a unique that's already in use
  607               -- Easiest thing is to do it all at once, as if all the top-level
  608               -- decls were mutually recursive
  609        ; let top_env = SE { se_subst = Core.mkEmptySubst $ mkInScopeSet $ mkVarSet $
  610                                        bindersOfBinds binds
  611                           , se_interesting = emptyVarSet
  612                           , se_module = this_mod
  613                           , se_dflags = dflags }
  614 
  615              go []           = return ([], emptyUDs)
  616              go (bind:binds) = do (binds', uds) <- go binds
  617                                   (bind', uds') <- specBind top_env bind uds
  618                                   return (bind' ++ binds', uds')
  619 
  620              -- Specialise the bindings of this module
  621        ; (binds', uds) <- runSpecM (go binds)
  622 
  623        ; (spec_rules, spec_binds) <- specImports top_env local_rules uds
  624 
  625        ; return (guts { mg_binds = spec_binds ++ binds'
  626                       , mg_rules = spec_rules ++ local_rules }) }
  627 
  628 {-
  629 Note [Wrap bindings returned by specImports]
  630 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  631 'specImports' returns a set of specialized bindings. However, these are lacking
  632 necessary floated dictionary bindings, which are returned by
  633 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
  634 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
  635 for instance, the 'specImports' call in 'specProgram'.
  636 
  637 
  638 Note [Disabling cross-module specialisation]
  639 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  640 Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
  641 in modules outside of the current module. This can sometimes uncover user code
  642 which explodes in size when aggressively optimized. The
  643 -fno-cross-module-specialise option was introduced to allow users to being
  644 bitten by such instances to revert to the pre-7.10 behavior.
  645 
  646 See #10491
  647 -}
  648 
  649 
  650 {- *********************************************************************
  651 *                                                                      *
  652                    Specialising imported functions
  653 *                                                                      *
  654 ********************************************************************* -}
  655 
  656 specImports :: SpecEnv
  657             -> [CoreRule]
  658             -> UsageDetails
  659             -> CoreM ([CoreRule], [CoreBind])
  660 specImports top_env local_rules
  661             (MkUD { ud_binds = dict_binds, ud_calls = calls })
  662   | not $ gopt Opt_CrossModuleSpecialise (se_dflags top_env)
  663     -- See Note [Disabling cross-module specialisation]
  664   = return ([], wrapDictBinds dict_binds [])
  665 
  666   | otherwise
  667   = do { hpt_rules <- getRuleBase
  668        ; let rule_base = extendRuleBaseList hpt_rules local_rules
  669 
  670        ; (spec_rules, spec_binds) <- spec_imports top_env [] rule_base
  671                                                   dict_binds calls
  672 
  673              -- Don't forget to wrap the specialized bindings with
  674              -- bindings for the needed dictionaries.
  675              -- See Note [Wrap bindings returned by specImports]
  676              -- and Note [Glom the bindings if imported functions are specialised]
  677        ; let final_binds
  678                | null spec_binds = wrapDictBinds dict_binds []
  679                | otherwise       = [Rec $ flattenBinds $
  680                                     wrapDictBinds dict_binds spec_binds]
  681 
  682        ; return (spec_rules, final_binds)
  683     }
  684 
  685 -- | Specialise a set of calls to imported bindings
  686 spec_imports :: SpecEnv          -- Passed in so that all top-level Ids are in scope
  687              -> [Id]             -- Stack of imported functions being specialised
  688                                  -- See Note [specImport call stack]
  689              -> RuleBase         -- Rules from this module and the home package
  690                                  -- (but not external packages, which can change)
  691              -> Bag DictBind     -- Dict bindings, used /only/ for filterCalls
  692                                  -- See Note [Avoiding loops in specImports]
  693              -> CallDetails      -- Calls for imported things
  694              -> CoreM ( [CoreRule]   -- New rules
  695                       , [CoreBind] ) -- Specialised bindings
  696 spec_imports top_env callers rule_base dict_binds calls
  697   = do { let import_calls = dVarEnvElts calls
  698        -- ; debugTraceMsg (text "specImports {" <+>
  699        --                  vcat [ text "calls:" <+> ppr import_calls
  700        --                       , text "dict_binds:" <+> ppr dict_binds ])
  701        ; (rules, spec_binds) <- go rule_base import_calls
  702        -- ; debugTraceMsg (text "End specImports }" <+> ppr import_calls)
  703 
  704        ; return (rules, spec_binds) }
  705   where
  706     go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
  707     go _ [] = return ([], [])
  708     go rb (cis : other_calls)
  709       = do { -- debugTraceMsg (text "specImport {" <+> ppr cis)
  710            ; (rules1, spec_binds1) <- spec_import top_env callers rb dict_binds cis
  711            -- ; debugTraceMsg (text "specImport }" <+> ppr cis)
  712 
  713            ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
  714            ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
  715 
  716 spec_import :: SpecEnv               -- Passed in so that all top-level Ids are in scope
  717             -> [Id]                  -- Stack of imported functions being specialised
  718                                      -- See Note [specImport call stack]
  719             -> RuleBase              -- Rules from this module
  720             -> Bag DictBind          -- Dict bindings, used /only/ for filterCalls
  721                                      -- See Note [Avoiding loops in specImports]
  722             -> CallInfoSet           -- Imported function and calls for it
  723             -> CoreM ( [CoreRule]    -- New rules
  724                      , [CoreBind] )  -- Specialised bindings
  725 spec_import top_env callers rb dict_binds cis@(CIS fn _)
  726   | isIn "specImport" fn callers
  727   = return ([], [])     -- No warning.  This actually happens all the time
  728                         -- when specialising a recursive function, because
  729                         -- the RHS of the specialised function contains a recursive
  730                         -- call to the original function
  731 
  732   | null good_calls
  733   = return ([], [])
  734 
  735   | Just rhs <- canSpecImport dflags fn
  736   = do {     -- Get rules from the external package state
  737              -- We keep doing this in case we "page-fault in"
  738              -- more rules as we go along
  739        ; hsc_env <- getHscEnv
  740        ; eps <- liftIO $ hscEPS hsc_env
  741        ; vis_orphs <- getVisibleOrphanMods
  742        ; let full_rb = unionRuleBase rb (eps_rule_base eps)
  743              rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
  744 
  745        ; (rules1, spec_pairs, MkUD { ud_binds = dict_binds1, ud_calls = new_calls })
  746             <- -- debugTraceMsg (text "specImport1" <+> vcat [ppr fn, ppr good_calls, ppr rhs]) >>
  747                 (runSpecM $ specCalls True top_env rules_for_fn good_calls fn rhs)
  748        ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
  749              -- After the rules kick in we may get recursion, but
  750              -- we rely on a global GlomBinds to sort that out later
  751              -- See Note [Glom the bindings if imported functions are specialised]
  752 
  753               -- Now specialise any cascaded calls
  754        -- ; debugTraceMsg (text "specImport 2" <+> (ppr fn $$ ppr rules1 $$ ppr spec_binds1))
  755        ; (rules2, spec_binds2) <- spec_imports top_env
  756                                                (fn:callers)
  757                                                (extendRuleBaseList rb rules1)
  758                                                (dict_binds `unionBags` dict_binds1)
  759                                                new_calls
  760 
  761        ; let final_binds = wrapDictBinds dict_binds1 $
  762                            spec_binds2 ++ spec_binds1
  763 
  764        ; return (rules2 ++ rules1, final_binds) }
  765 
  766   | otherwise
  767   = do { tryWarnMissingSpecs dflags callers fn good_calls
  768        ; return ([], [])}
  769 
  770   where
  771     dflags = se_dflags top_env
  772     good_calls = filterCalls cis dict_binds
  773        -- SUPER IMPORTANT!  Drop calls that (directly or indirectly) refer to fn
  774        -- See Note [Avoiding loops in specImports]
  775 
  776 canSpecImport :: DynFlags -> Id -> Maybe CoreExpr
  777 -- See Note [Specialise imported INLINABLE things]
  778 canSpecImport dflags fn
  779   | isDataConWrapId fn
  780   = Nothing   -- Don't specialise data-con wrappers, even if they
  781               -- have dict args; there is no benefit.
  782 
  783   | CoreUnfolding { uf_src = src, uf_tmpl = rhs } <- unf
  784   , isStableSource src
  785   = Just rhs   -- By default, specialise only imported things that have a stable
  786                -- unfolding; that is, have an INLINE or INLINABLE pragma
  787                -- Specialise even INLINE things; it hasn't inlined yet,
  788                -- so perhaps it never will.  Moreover it may have calls
  789                -- inside it that we want to specialise
  790 
  791     -- CoreUnfolding case does /not/ include DFunUnfoldings;
  792     -- We only specialise DFunUnfoldings with -fspecialise-aggressively
  793     -- See Note [Do not specialise imported DFuns]
  794 
  795   | gopt Opt_SpecialiseAggressively dflags
  796   = maybeUnfoldingTemplate unf  -- With -fspecialise-aggressively, specialise anything
  797                                 -- with an unfolding, stable or not, DFun or not
  798 
  799   | otherwise = Nothing
  800   where
  801     unf = realIdUnfolding fn   -- We want to see the unfolding even for loop breakers
  802 
  803 -- | Returns whether or not to show a missed-spec warning.
  804 -- If -Wall-missed-specializations is on, show the warning.
  805 -- Otherwise, if -Wmissed-specializations is on, only show a warning
  806 -- if there is at least one imported function being specialized,
  807 -- and if all imported functions are marked with an inline pragma
  808 -- Use the most specific warning as the reason.
  809 tryWarnMissingSpecs :: DynFlags -> [Id] -> Id -> [CallInfo] -> CoreM ()
  810 -- See Note [Warning about missed specialisations]
  811 tryWarnMissingSpecs dflags callers fn calls_for_fn
  812   | isClassOpId fn = return () -- See Note [Missed specialization for ClassOps]
  813   | wopt Opt_WarnMissedSpecs dflags
  814     && not (null callers)
  815     && allCallersInlined                  = doWarn $ WarningWithFlag Opt_WarnMissedSpecs
  816   | wopt Opt_WarnAllMissedSpecs dflags    = doWarn $ WarningWithFlag Opt_WarnAllMissedSpecs
  817   | otherwise                             = return ()
  818   where
  819     allCallersInlined = all (isAnyInlinePragma . idInlinePragma) callers
  820     diag_opts = initDiagOpts dflags
  821     doWarn reason =
  822       msg (mkMCDiagnostic diag_opts reason)
  823         (vcat [ hang (text ("Could not specialise imported function") <+> quotes (ppr fn))
  824                 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
  825                         | caller <- callers])
  826           , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
  827           , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
  828 
  829 {- Note [Missed specialisation for ClassOps]
  830 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  831 In #19592 I saw a number of missed specialisation warnings
  832 which were the result of things like:
  833 
  834     case isJumpishInstr @X86.Instr $dInstruction_s7f8 eta3_a78C of { ...
  835 
  836 where isJumpishInstr is part of the Instruction class and defined like
  837 this:
  838 
  839     class Instruction instr where
  840         ...
  841         isJumpishInstr :: instr -> Bool
  842         ...
  843 
  844 isJumpishInstr is a ClassOp which will select the right method
  845 from within the dictionary via our built in rules. See also
  846 Note [ClassOp/DFun selection] in GHC.Tc.TyCl.Instance.
  847 
  848 We don't give these unfoldings, and as a result the specialiser
  849 complains. But usually this doesn't matter. The simplifier will
  850 apply the rule and we end up with
  851 
  852     case isJumpishInstrImplX86 eta3_a78C of { ...
  853 
  854 Since isJumpishInstrImplX86 is defined for a concrete instance (given
  855 by the dictionary) it is usually already well specialised!
  856 Theoretically the implementation of a method could still be overloaded
  857 over a different type class than what it's a method of. But I wasn't able
  858 to make this go wrong, and SPJ thinks this should be fine as well.
  859 
  860 So I decided to remove the warnings for failed specialisations on ClassOps
  861 alltogether as they do more harm than good.
  862 -}
  863 
  864 {- Note [Do not specialise imported DFuns]
  865 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  866 Ticket #18223 shows that specialising calls of DFuns is can cause a huge
  867 and entirely unnecessary blowup in program size.  Consider a call to
  868     f @[[[[[[[[T]]]]]]]] d1 x
  869 where df :: C a => C [a]
  870       d1 :: C [[[[[[[[T]]]]]]]] = dfC[] @[[[[[[[T]]]]]]] d1
  871       d2 :: C [[[[[[[T]]]]]]]   = dfC[] @[[[[[[T]]]]]] d3
  872       ...
  873 Now we'll specialise f's RHS, which may give rise to calls to 'g',
  874 also overloaded, which we will specialise, and so on.  However, if
  875 we specialise the calls to dfC[], we'll generate specialised copies of
  876 all methods of C, at all types; and the same for C's superclasses.
  877 
  878 And many of these specialised functions will never be called.  We are
  879 going to call the specialised 'f', and the specialised 'g', but DFuns
  880 group functions into a tuple, many of whose elements may never be used.
  881 
  882 With deeply-nested types this can lead to a simply overwhelming number
  883 of specialisations: see #18223 for a simple example (from the wild).
  884 I measured the number of specialisations for various numbers of calls
  885 of `flip evalStateT ()`, and got this
  886 
  887                        Size after one simplification
  888   #calls    #SPEC rules    Terms     Types
  889       5         56          3100     10600
  890       9        108         13660     77206
  891 
  892 The real tests case has 60+ calls, which blew GHC out of the water.
  893 
  894 Solution: don't specialise DFuns.  The downside is that if we end
  895 up with (h (dfun d)), /and/ we don't specialise 'h', then we won't
  896 pass to 'h' a tuple of specialised functions.
  897 
  898 However, the flag -fspecialise-aggressively (experimental, off by default)
  899 allows DFuns to specialise as well.
  900 
  901 Note [Avoiding loops in specImports]
  902 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  903 We must take great care when specialising instance declarations
  904 (DFuns like $fOrdList) lest we accidentally build a recursive
  905 dictionary. See Note [Avoiding loops (DFuns)].
  906 
  907 The basic strategy of Note [Avoiding loops (DFuns)] is to use filterCalls
  908 to discard loopy specialisations.  But to do that we must ensure
  909 that the in-scope dict-binds (passed to filterCalls) contains
  910 all the needed dictionary bindings.  In particular, in the recursive
  911 call to spec_imports in spec_import, we must include the dict-binds
  912 from the parent.  Lacking this caused #17151, a really nasty bug.
  913 
  914 Here is what happened.
  915 * Class structure:
  916     Source is a superclass of Mut
  917     Index is a superclass of Source
  918 
  919 * We started with these dict binds
  920     dSource = $fSourcePix @Int $fIndexInt
  921     dIndex  = sc_sel dSource
  922     dMut    = $fMutPix @Int dIndex
  923   and these calls to specialise
  924     $fMutPix @Int dIndex
  925     $fSourcePix @Int $fIndexInt
  926 
  927 * We specialised the call ($fMutPix @Int dIndex)
  928   ==> new call ($fSourcePix @Int dIndex)
  929       (because Source is a superclass of Mut)
  930 
  931 * We specialised ($fSourcePix @Int dIndex)
  932   ==> produces specialised dict $s$fSourcePix,
  933       a record with dIndex as a field
  934       plus RULE forall d. ($fSourcePix @Int d) = $s$fSourcePix
  935   *** This is the bogus step ***
  936 
  937 * Now we decide not to specialise the call
  938     $fSourcePix @Int $fIndexInt
  939   because we alredy have a RULE that matches it
  940 
  941 * Finally the simplifer rewrites
  942     dSource = $fSourcePix @Int $fIndexInt
  943     ==>  dSource = $s$fSourcePix
  944 
  945 Disaster. Now we have
  946 
  947 Rewrite dSource's RHS to $s$fSourcePix   Disaster
  948     dSource = $s$fSourcePix
  949     dIndex  = sc_sel dSource
  950     $s$fSourcePix = MkSource dIndex ...
  951 
  952 Solution: filterCalls should have stopped the bogus step,
  953 by seeing that dIndex transitively uses $fSourcePix. But
  954 it can only do that if it sees all the dict_binds.  Wow.
  955 
  956 --------------
  957 Here's another example (#13429).  Suppose we have
  958   class Monoid v => C v a where ...
  959 
  960 We start with a call
  961    f @ [Integer] @ Integer $fC[]Integer
  962 
  963 Specialising call to 'f' gives dict bindings
  964    $dMonoid_1 :: Monoid [Integer]
  965    $dMonoid_1 = M.$p1C @ [Integer] $fC[]Integer
  966 
  967    $dC_1 :: C [Integer] (Node [Integer] Integer)
  968    $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
  969 
  970 ...plus a recursive call to
  971    f @ [Integer] @ (Node [Integer] Integer) $dC_1
  972 
  973 Specialising that call gives
  974    $dMonoid_2  :: Monoid [Integer]
  975    $dMonoid_2  = M.$p1C @ [Integer] $dC_1
  976 
  977    $dC_2 :: C [Integer] (Node [Integer] Integer)
  978    $dC_2 = M.$fCvNode @ [Integer] $dMonoid_2
  979 
  980 Now we have two calls to the imported function
  981   M.$fCvNode :: Monoid v => C v a
  982   M.$fCvNode @v @a m = C m some_fun
  983 
  984 But we must /not/ use the call (M.$fCvNode @ [Integer] $dMonoid_2)
  985 for specialisation, else we get:
  986 
  987   $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
  988   $dMonoid_2 = M.$p1C @ [Integer] $dC_1
  989   $s$fCvNode = C $dMonoid_2 ...
  990     RULE M.$fCvNode [Integer] _ _ = $s$fCvNode
  991 
  992 Now use the rule to rewrite the call in the RHS of $dC_1
  993 and we get a loop!
  994 
  995 
  996 Note [specImport call stack]
  997 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  998 When specialising an imports function 'f', we may get new calls
  999 of an imported function 'g', which we want to specialise in turn,
 1000 and similarly specialising 'g' might expose a new call to 'h'.
 1001 
 1002 We track the stack of enclosing functions. So when specialising 'h' we
 1003 haev a specImport call stack of [g,f]. We do this for two reasons:
 1004 * Note [Warning about missed specialisations]
 1005 * Note [Avoiding recursive specialisation]
 1006 
 1007 Note [Warning about missed specialisations]
 1008 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1009 Suppose
 1010  * In module Lib, you carefully mark a function 'foo' INLINABLE
 1011  * Import Lib(foo) into another module M
 1012  * Call 'foo' at some specialised type in M
 1013 Then you jolly well expect it to be specialised in M.  But what if
 1014 'foo' calls another function 'Lib.bar'.  Then you'd like 'bar' to be
 1015 specialised too.  But if 'bar' is not marked INLINABLE it may well
 1016 not be specialised.  The warning Opt_WarnMissedSpecs warns about this.
 1017 
 1018 It's more noisy to warning about a missed specialisation opportunity
 1019 for /every/ overloaded imported function, but sometimes useful. That
 1020 is what Opt_WarnAllMissedSpecs does.
 1021 
 1022 ToDo: warn about missed opportunities for local functions.
 1023 
 1024 Note [Avoiding recursive specialisation]
 1025 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1026 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
 1027 'f's RHS.  So we want to specialise g,h.  But we don't want to
 1028 specialise f any more!  It's possible that f's RHS might have a
 1029 recursive yet-more-specialised call, so we'd diverge in that case.
 1030 And if the call is to the same type, one specialisation is enough.
 1031 Avoiding this recursive specialisation loop is one reason for the
 1032 'callers' stack passed to specImports and specImport.
 1033 
 1034 Note [Specialise imported INLINABLE things]
 1035 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1036 What imported functions do we specialise?  The basic set is
 1037  * DFuns and things with INLINABLE pragmas.
 1038 but with -fspecialise-aggressively we add
 1039  * Anything with an unfolding template
 1040 
 1041 #8874 has a good example of why we want to auto-specialise DFuns.
 1042 
 1043 We have the -fspecialise-aggressively flag (usually off), because we
 1044 risk lots of orphan modules from over-vigorous specialisation.
 1045 However it's not a big deal: anything non-recursive with an
 1046 unfolding-template will probably have been inlined already.
 1047 
 1048 Note [Glom the bindings if imported functions are specialised]
 1049 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1050 Suppose we have an imported, *recursive*, INLINABLE function
 1051    f :: Eq a => a -> a
 1052    f = /\a \d x. ...(f a d)...
 1053 In the module being compiled we have
 1054    g x = f (x::Int)
 1055 Now we'll make a specialised function
 1056    f_spec :: Int -> Int
 1057    f_spec = \x -> ...(f Int dInt)...
 1058    {-# RULE  f Int _ = f_spec #-}
 1059    g = \x. f Int dInt x
 1060 Note that f_spec doesn't look recursive
 1061 After rewriting with the RULE, we get
 1062    f_spec = \x -> ...(f_spec)...
 1063 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
 1064 The occurrence analyser never turns a NonRec into a Rec.  So we must
 1065 make sure that f_spec is recursive.  Easiest thing is to make all
 1066 the specialisations for imported bindings recursive.
 1067 
 1068 
 1069 
 1070 ************************************************************************
 1071 *                                                                      *
 1072 \subsubsection{@specExpr@: the main function}
 1073 *                                                                      *
 1074 ************************************************************************
 1075 -}
 1076 
 1077 data SpecEnv
 1078   = SE { se_subst :: Core.Subst
 1079              -- We carry a substitution down:
 1080              -- a) we must clone any binding that might float outwards,
 1081              --    to avoid name clashes
 1082              -- b) we carry a type substitution to use when analysing
 1083              --    the RHS of specialised bindings (no type-let!)
 1084 
 1085 
 1086        , se_interesting :: VarSet
 1087              -- Dict Ids that we know something about
 1088              -- and hence may be worth specialising against
 1089              -- See Note [Interesting dictionary arguments]
 1090 
 1091        , se_module :: Module
 1092        , se_dflags :: DynFlags
 1093      }
 1094 
 1095 instance Outputable SpecEnv where
 1096   ppr (SE { se_subst = subst, se_interesting = interesting })
 1097     = text "SE" <+> braces (sep $ punctuate comma
 1098         [ text "subst =" <+> ppr subst
 1099         , text "interesting =" <+> ppr interesting ])
 1100 
 1101 specVar :: SpecEnv -> Id -> CoreExpr
 1102 specVar env v = Core.lookupIdSubst (se_subst env) v
 1103 
 1104 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
 1105 
 1106 ---------------- First the easy cases --------------------
 1107 specExpr env (Type ty)     = return (Type     (substTy env ty), emptyUDs)
 1108 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
 1109 specExpr env (Var v)       = return (specVar env v, emptyUDs)
 1110 specExpr _   (Lit lit)     = return (Lit lit,       emptyUDs)
 1111 specExpr env (Cast e co)
 1112   = do { (e', uds) <- specExpr env e
 1113        ; return ((mkCast e' (substCo env co)), uds) }
 1114 specExpr env (Tick tickish body)
 1115   = do { (body', uds) <- specExpr env body
 1116        ; return (Tick (specTickish env tickish) body', uds) }
 1117 
 1118 ---------------- Applications might generate a call instance --------------------
 1119 specExpr env expr@(App {})
 1120   = go expr []
 1121   where
 1122     go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
 1123                                (fun', uds_app) <- go fun (arg':args)
 1124                                return (App fun' arg', uds_arg `plusUDs` uds_app)
 1125 
 1126     go (Var f)       args = case specVar env f of
 1127                                 Var f' -> return (Var f', mkCallUDs env f' args)
 1128                                 e'     -> return (e', emptyUDs) -- I don't expect this!
 1129     go other         _    = specExpr env other
 1130 
 1131 ---------------- Lambda/case require dumping of usage details --------------------
 1132 specExpr env e@(Lam {})
 1133   = specLam env' bndrs' body
 1134   where
 1135     (bndrs, body)  = collectBinders e
 1136     (env', bndrs') = substBndrs env bndrs
 1137         -- More efficient to collect a group of binders together all at once
 1138         -- and we don't want to split a lambda group with dumped bindings
 1139 
 1140 specExpr env (Case scrut case_bndr ty alts)
 1141   = do { (scrut', scrut_uds) <- specExpr env scrut
 1142        ; (scrut'', case_bndr', alts', alts_uds)
 1143              <- specCase env scrut' case_bndr alts
 1144        ; return (Case scrut'' case_bndr' (substTy env ty) alts'
 1145                 , scrut_uds `plusUDs` alts_uds) }
 1146 
 1147 ---------------- Finally, let is the interesting case --------------------
 1148 specExpr env (Let bind body)
 1149   = do { -- Clone binders
 1150          (rhs_env, body_env, bind') <- cloneBindSM env bind
 1151 
 1152          -- Deal with the body
 1153        ; (body', body_uds) <- specExpr body_env body
 1154 
 1155         -- Deal with the bindings
 1156       ; (binds', uds) <- specBind rhs_env bind' body_uds
 1157 
 1158         -- All done
 1159       ; return (foldr Let body' binds', uds) }
 1160 
 1161 --------------
 1162 specLam :: SpecEnv -> [OutBndr] -> InExpr -> SpecM (OutExpr, UsageDetails)
 1163 -- The binders have been substituted, but the body has not
 1164 specLam env bndrs body
 1165   | null bndrs
 1166   = specExpr env body
 1167   | otherwise
 1168   = do { (body', uds) <- specExpr env body
 1169        ; let (free_uds, dumped_dbs) = dumpUDs bndrs uds
 1170        ; return (mkLams bndrs (wrapDictBindsE dumped_dbs body'), free_uds) }
 1171 
 1172 --------------
 1173 specTickish :: SpecEnv -> CoreTickish -> CoreTickish
 1174 specTickish env (Breakpoint ext ix ids)
 1175   = Breakpoint ext ix [ id' | id <- ids, Var id' <- [specVar env id]]
 1176   -- drop vars from the list if they have a non-variable substitution.
 1177   -- should never happen, but it's harmless to drop them anyway.
 1178 specTickish _ other_tickish = other_tickish
 1179 
 1180 --------------
 1181 specCase :: SpecEnv
 1182          -> CoreExpr            -- Scrutinee, already done
 1183          -> Id -> [CoreAlt]
 1184          -> SpecM ( CoreExpr    -- New scrutinee
 1185                   , Id
 1186                   , [CoreAlt]
 1187                   , UsageDetails)
 1188 specCase env scrut' case_bndr [Alt con args rhs]
 1189   | isDictId case_bndr           -- See Note [Floating dictionaries out of cases]
 1190   , interestingDict env scrut'
 1191   , not (isDeadBinder case_bndr && null sc_args')
 1192   = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
 1193 
 1194        ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
 1195                               [Alt con args' (Var sc_arg')]
 1196                        | sc_arg' <- sc_args' ]
 1197 
 1198              -- Extend the substitution for RHS to map the *original* binders
 1199              -- to their floated versions.
 1200              mb_sc_flts :: [Maybe DictId]
 1201              mb_sc_flts = map (lookupVarEnv clone_env) args'
 1202              clone_env  = zipVarEnv sc_args' sc_args_flt
 1203              subst_prs  = (case_bndr, Var case_bndr_flt)
 1204                         : [ (arg, Var sc_flt)
 1205                           | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
 1206              env_rhs' = env_rhs { se_subst = Core.extendIdSubstList (se_subst env_rhs) subst_prs
 1207                                 , se_interesting = se_interesting env_rhs `extendVarSetList`
 1208                                                    (case_bndr_flt : sc_args_flt) }
 1209 
 1210        ; (rhs', rhs_uds)   <- specExpr env_rhs' rhs
 1211        ; let scrut_bind    = mkDB (NonRec case_bndr_flt scrut')
 1212              case_bndr_set = unitVarSet case_bndr_flt
 1213              sc_binds      = [ DB { db_bind = NonRec sc_arg_flt sc_rhs
 1214                                   , db_fvs  = case_bndr_set }
 1215                              | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
 1216              flt_binds     = scrut_bind : sc_binds
 1217              (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
 1218              all_uds = flt_binds `addDictBinds` free_uds
 1219              alt'    = Alt con args' (wrapDictBindsE dumped_dbs rhs')
 1220        ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
 1221   where
 1222     (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
 1223     sc_args' = filter is_flt_sc_arg args'
 1224 
 1225     clone_me bndr = do { uniq <- getUniqueM
 1226                        ; return (mkUserLocalOrCoVar occ uniq wght ty loc) }
 1227        where
 1228          name = idName bndr
 1229          wght = idMult bndr
 1230          ty   = idType bndr
 1231          occ  = nameOccName name
 1232          loc  = getSrcSpan name
 1233 
 1234     arg_set = mkVarSet args'
 1235     is_flt_sc_arg var =  isId var
 1236                       && not (isDeadBinder var)
 1237                       && isDictTy var_ty
 1238                       && tyCoVarsOfType var_ty `disjointVarSet` arg_set
 1239        where
 1240          var_ty = idType var
 1241 
 1242 
 1243 specCase env scrut case_bndr alts
 1244   = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
 1245        ; return (scrut, case_bndr', alts', uds_alts) }
 1246   where
 1247     (env_alt, case_bndr') = substBndr env case_bndr
 1248     spec_alt (Alt con args rhs) = do
 1249           (rhs', uds) <- specExpr env_rhs rhs
 1250           let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
 1251           return (Alt con args' (wrapDictBindsE dumped_dbs rhs'), free_uds)
 1252         where
 1253           (env_rhs, args') = substBndrs env_alt args
 1254 
 1255 {-
 1256 Note [Floating dictionaries out of cases]
 1257 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1258 Consider
 1259    g = \d. case d of { MkD sc ... -> ...(f sc)... }
 1260 Naively we can't float d2's binding out of the case expression,
 1261 because 'sc' is bound by the case, and that in turn means we can't
 1262 specialise f, which seems a pity.
 1263 
 1264 So we invert the case, by floating out a binding
 1265 for 'sc_flt' thus:
 1266     sc_flt = case d of { MkD sc ... -> sc }
 1267 Now we can float the call instance for 'f'.  Indeed this is just
 1268 what'll happen if 'sc' was originally bound with a let binding,
 1269 but case is more efficient, and necessary with equalities. So it's
 1270 good to work with both.
 1271 
 1272 You might think that this won't make any difference, because the
 1273 call instance will only get nuked by the \d.  BUT if 'g' itself is
 1274 specialised, then transitively we should be able to specialise f.
 1275 
 1276 In general, given
 1277    case e of cb { MkD sc ... -> ...(f sc)... }
 1278 we transform to
 1279    let cb_flt = e
 1280        sc_flt = case cb_flt of { MkD sc ... -> sc }
 1281    in
 1282    case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
 1283 
 1284 The "_flt" things are the floated binds; we use the current substitution
 1285 to substitute sc -> sc_flt in the RHS
 1286 
 1287 ************************************************************************
 1288 *                                                                      *
 1289                      Dealing with a binding
 1290 *                                                                      *
 1291 ************************************************************************
 1292 -}
 1293 
 1294 specBind :: SpecEnv                     -- Use this for RHSs
 1295          -> CoreBind                    -- Binders are already cloned by cloneBindSM,
 1296                                         -- but RHSs are un-processed
 1297          -> UsageDetails                -- Info on how the scope of the binding
 1298          -> SpecM ([CoreBind],          -- New bindings
 1299                    UsageDetails)        -- And info to pass upstream
 1300 
 1301 -- Returned UsageDetails:
 1302 --    No calls for binders of this bind
 1303 specBind rhs_env (NonRec fn rhs) body_uds
 1304   = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
 1305 
 1306         ; let zapped_fn = zapIdDemandInfo fn
 1307               -- We zap the demand info because the binding may float,
 1308               -- which would invaidate the demand info (see #17810 for example).
 1309               -- Destroying demand info is not terrible; specialisation is
 1310               -- always followed soon by demand analysis.
 1311       ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds zapped_fn rhs
 1312 
 1313        ; let pairs = spec_defns ++ [(fn', rhs')]
 1314                         -- fn' mentions the spec_defns in its rules,
 1315                         -- so put the latter first
 1316 
 1317              combined_uds = body_uds1 `plusUDs` rhs_uds
 1318 
 1319              (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
 1320 
 1321              final_binds :: [DictBind]
 1322              -- See Note [From non-recursive to recursive]
 1323              final_binds
 1324                | not (isEmptyBag dump_dbs)
 1325                , not (null spec_defns)
 1326                = [recWithDumpedDicts pairs dump_dbs]
 1327                | otherwise
 1328                = [mkDB $ NonRec b r | (b,r) <- pairs]
 1329                  ++ bagToList dump_dbs
 1330 
 1331        ; if float_all then
 1332              -- Rather than discard the calls mentioning the bound variables
 1333              -- we float this (dictionary) binding along with the others
 1334               return ([], free_uds `snocDictBinds` final_binds)
 1335          else
 1336              -- No call in final_uds mentions bound variables,
 1337              -- so we can just leave the binding here
 1338               return (map db_bind final_binds, free_uds) }
 1339 
 1340 
 1341 specBind rhs_env (Rec pairs) body_uds
 1342        -- Note [Specialising a recursive group]
 1343   = do { let (bndrs,rhss) = unzip pairs
 1344        ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
 1345        ; let scope_uds = body_uds `plusUDs` rhs_uds
 1346                        -- Includes binds and calls arising from rhss
 1347 
 1348        ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
 1349 
 1350        ; (bndrs3, spec_defns3, uds3)
 1351              <- if null spec_defns1  -- Common case: no specialisation
 1352                 then return (bndrs1, [], uds1)
 1353                 else do {            -- Specialisation occurred; do it again
 1354                           (bndrs2, spec_defns2, uds2)
 1355                               <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
 1356                         ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
 1357 
 1358        ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
 1359              final_bind = recWithDumpedDicts (spec_defns3 ++ zip bndrs3 rhss')
 1360                                              dumped_dbs
 1361 
 1362        ; if float_all then
 1363               return ([], final_uds `snocDictBind` final_bind)
 1364          else
 1365               return ([db_bind final_bind], final_uds) }
 1366 
 1367 
 1368 ---------------------------
 1369 specDefns :: SpecEnv
 1370           -> UsageDetails               -- Info on how it is used in its scope
 1371           -> [(OutId,InExpr)]           -- The things being bound and their un-processed RHS
 1372           -> SpecM ([OutId],            -- Original Ids with RULES added
 1373                     [(OutId,OutExpr)],  -- Extra, specialised bindings
 1374                     UsageDetails)       -- Stuff to fling upwards from the specialised versions
 1375 
 1376 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
 1377 -- upwards binding by binding.  Example: { f = ...g ...; g = ...f .... }
 1378 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
 1379 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
 1380 -- But not vice versa (it's a fixpoint problem).
 1381 
 1382 specDefns _env uds []
 1383   = return ([], [], uds)
 1384 specDefns env uds ((bndr,rhs):pairs)
 1385   = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
 1386        ; (bndr1, spec_defns2, uds2)  <- specDefn env uds1 bndr rhs
 1387        ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
 1388 
 1389 ---------------------------
 1390 specDefn :: SpecEnv
 1391          -> UsageDetails                -- Info on how it is used in its scope
 1392          -> OutId -> InExpr             -- The thing being bound and its un-processed RHS
 1393          -> SpecM (Id,                  -- Original Id with added RULES
 1394                    [(Id,CoreExpr)],     -- Extra, specialised bindings
 1395                    UsageDetails)        -- Stuff to fling upwards from the specialised versions
 1396 
 1397 specDefn env body_uds fn rhs
 1398   = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
 1399              rules_for_me = idCoreRules fn
 1400        ; (rules, spec_defns, spec_uds) <- specCalls False env rules_for_me
 1401                                                     calls_for_me fn rhs
 1402        ; return ( fn `addIdSpecialisations` rules
 1403                 , spec_defns
 1404                 , body_uds_without_me `plusUDs` spec_uds) }
 1405                 -- It's important that the `plusUDs` is this way
 1406                 -- round, because body_uds_without_me may bind
 1407                 -- dictionaries that are used in calls_for_me passed
 1408                 -- to specDefn.  So the dictionary bindings in
 1409                 -- spec_uds may mention dictionaries bound in
 1410                 -- body_uds_without_me
 1411 
 1412 ---------------------------
 1413 specCalls :: Bool              -- True  =>  specialising imported fn
 1414                                -- False =>  specialising local fn
 1415           -> SpecEnv
 1416           -> [CoreRule]        -- Existing RULES for the fn
 1417           -> [CallInfo]
 1418           -> OutId -> InExpr
 1419           -> SpecM SpecInfo    -- New rules, specialised bindings, and usage details
 1420 
 1421 -- This function checks existing rules, and does not create
 1422 -- duplicate ones. So the caller does not need to do this filtering.
 1423 -- See 'already_covered'
 1424 
 1425 type SpecInfo = ( [CoreRule]       -- Specialisation rules
 1426                 , [(Id,CoreExpr)]  -- Specialised definition
 1427                 , UsageDetails )   -- Usage details from specialised RHSs
 1428 
 1429 specCalls spec_imp env existing_rules calls_for_me fn rhs
 1430         -- The first case is the interesting one
 1431   |  notNull calls_for_me               -- And there are some calls to specialise
 1432   && not (isNeverActive (idInlineActivation fn))
 1433         -- Don't specialise NOINLINE things
 1434         -- See Note [Auto-specialisation and RULES]
 1435 
 1436 --   && not (certainlyWillInline (idUnfolding fn))      -- And it's not small
 1437 --      See Note [Inline specialisation] for why we do not
 1438 --      switch off specialisation for inline functions
 1439 
 1440   = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr existing_rules) $
 1441     foldlM spec_call ([], [], emptyUDs) calls_for_me
 1442 
 1443   | otherwise   -- No calls or RHS doesn't fit our preconceptions
 1444   = warnPprTrace (not (exprIsTrivial rhs) && notNull calls_for_me)
 1445           (text "Missed specialisation opportunity for" <+> ppr fn $$ _trace_doc) $
 1446           -- Note [Specialisation shape]
 1447     -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
 1448     return ([], [], emptyUDs)
 1449   where
 1450     _trace_doc = sep [ ppr rhs_bndrs, ppr (idInlineActivation fn) ]
 1451 
 1452     fn_type   = idType fn
 1453     fn_arity  = idArity fn
 1454     fn_unf    = realIdUnfolding fn  -- Ignore loop-breaker-ness here
 1455     inl_prag  = idInlinePragma fn
 1456     inl_act   = inlinePragmaActivation inl_prag
 1457     is_local  = isLocalId fn
 1458     is_dfun   = isDFunId fn
 1459     dflags    = se_dflags env
 1460     ropts     = initRuleOpts dflags
 1461     this_mod  = se_module env
 1462         -- Figure out whether the function has an INLINE pragma
 1463         -- See Note [Inline specialisations]
 1464 
 1465     (rhs_bndrs, rhs_body) = collectBindersPushingCo rhs
 1466                             -- See Note [Account for casts in binding]
 1467 
 1468     in_scope = Core.substInScope (se_subst env)
 1469 
 1470     already_covered :: RuleOpts -> [CoreRule] -> [CoreExpr] -> Bool
 1471     already_covered ropts new_rules args      -- Note [Specialisations already covered]
 1472        = isJust (lookupRule ropts (in_scope, realIdUnfolding)
 1473                             (const True) fn args
 1474                             (new_rules ++ existing_rules))
 1475          -- NB: we look both in the new_rules (generated by this invocation
 1476          --     of specCalls), and in existing_rules (passed in to specCalls)
 1477 
 1478     ----------------------------------------------------------
 1479         -- Specialise to one particular call pattern
 1480     spec_call :: SpecInfo                         -- Accumulating parameter
 1481               -> CallInfo                         -- Call instance
 1482               -> SpecM SpecInfo
 1483     spec_call spec_acc@(rules_acc, pairs_acc, uds_acc) _ci@(CI { ci_key = call_args })
 1484       = -- See Note [Specialising Calls]
 1485         do { let all_call_args | is_dfun   = call_args ++ repeat UnspecArg
 1486                                | otherwise = call_args
 1487                                -- See Note [Specialising DFuns]
 1488            ; ( useful, rhs_env2, leftover_bndrs
 1489              , rule_bndrs, rule_lhs_args
 1490              , spec_bndrs1, dx_binds, spec_args) <- specHeader env rhs_bndrs all_call_args
 1491 
 1492 --           ; pprTrace "spec_call" (vcat [ text "call info: " <+> ppr _ci
 1493 --                                        , text "useful:    " <+> ppr useful
 1494 --                                        , text "rule_bndrs:" <+> ppr rule_bndrs
 1495 --                                        , text "lhs_args:  " <+> ppr rule_lhs_args
 1496 --                                        , text "spec_bndrs:" <+> ppr spec_bndrs1
 1497 --                                        , text "spec_args: " <+> ppr spec_args
 1498 --                                        , text "dx_binds:  " <+> ppr dx_binds
 1499 --                                        , text "rhs_env2:  " <+> ppr (se_subst rhs_env2)
 1500 --                                        , ppr dx_binds ]) $
 1501 --             return ()
 1502 
 1503            ; if not useful  -- No useful specialisation
 1504                 || already_covered ropts rules_acc rule_lhs_args
 1505              then return spec_acc
 1506              else
 1507         do { -- Run the specialiser on the specialised RHS
 1508              -- The "1" suffix is before we maybe add the void arg
 1509            ; (rhs_body', rhs_uds) <- specExpr rhs_env2 rhs_body
 1510                 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
 1511                 -- to the rhs_uds; see Note [Specialising Calls]
 1512            ; let rhs_uds_w_dx = foldr consDictBind rhs_uds dx_binds
 1513                  spec_rhs_bndrs  = spec_bndrs1 ++ leftover_bndrs
 1514                  (spec_uds, dumped_dbs) = dumpUDs spec_rhs_bndrs rhs_uds_w_dx
 1515                  spec_rhs1 = mkLams spec_rhs_bndrs $
 1516                              wrapDictBindsE dumped_dbs rhs_body'
 1517 
 1518                  spec_fn_ty1 = exprType spec_rhs1
 1519 
 1520                  -- Maybe add a void arg to the specialised function,
 1521                  -- to avoid unlifted bindings
 1522                  -- See Note [Specialisations Must Be Lifted]
 1523                  -- C.f. GHC.Core.Opt.WorkWrap.Utils.mkWorkerArgs
 1524                  add_void_arg = isUnliftedType spec_fn_ty1 && not (isJoinId fn)
 1525                  (spec_bndrs, spec_rhs, spec_fn_ty)
 1526                    | add_void_arg = ( voidPrimId : spec_bndrs1
 1527                                     , Lam        voidArgId  spec_rhs1
 1528                                     , mkVisFunTyMany unboxedUnitTy spec_fn_ty1)
 1529                    | otherwise   = (spec_bndrs1, spec_rhs1, spec_fn_ty1)
 1530 
 1531                  join_arity_decr = length rule_lhs_args - length spec_bndrs
 1532                  spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn
 1533                                  = Just (orig_join_arity - join_arity_decr)
 1534                                  | otherwise
 1535                                  = Nothing
 1536 
 1537            ; spec_fn <- newSpecIdSM fn spec_fn_ty spec_join_arity
 1538            ; let
 1539                 -- The rule to put in the function's specialisation is:
 1540                 --      forall x @b d1' d2'.
 1541                 --          f x @T1 @b @T2 d1' d2' = f1 x @b
 1542                 -- See Note [Specialising Calls]
 1543                 herald | spec_imp  = -- Specialising imported fn
 1544                                      text "SPEC/" <> ppr this_mod
 1545                        | otherwise = -- Specialising local fn
 1546                                      text "SPEC"
 1547 
 1548                 rule_name = mkFastString $ showSDoc dflags $
 1549                             herald <+> ftext (occNameFS (getOccName fn))
 1550                                    <+> hsep (mapMaybe ppr_call_key_ty call_args)
 1551                             -- This name ends up in interface files, so use occNameString.
 1552                             -- Otherwise uniques end up there, making builds
 1553                             -- less deterministic (See #4012 comment:61 ff)
 1554 
 1555                 rule_wout_eta = mkRule
 1556                                   this_mod
 1557                                   True {- Auto generated -}
 1558                                   is_local
 1559                                   rule_name
 1560                                   inl_act       -- Note [Auto-specialisation and RULES]
 1561                                   (idName fn)
 1562                                   rule_bndrs
 1563                                   rule_lhs_args
 1564                                   (mkVarApps (Var spec_fn) spec_bndrs)
 1565 
 1566                 spec_rule
 1567                   = case isJoinId_maybe fn of
 1568                       Just join_arity -> etaExpandToJoinPointRule join_arity rule_wout_eta
 1569                       Nothing -> rule_wout_eta
 1570 
 1571                 simpl_opts = initSimpleOpts dflags
 1572 
 1573                 --------------------------------------
 1574                 -- Add a suitable unfolding if the spec_inl_prag says so
 1575                 -- See Note [Inline specialisations]
 1576                 (spec_inl_prag, spec_unf)
 1577                   | not is_local && isStrongLoopBreaker (idOccInfo fn)
 1578                   = (neverInlinePragma, noUnfolding)
 1579                         -- See Note [Specialising imported functions] in "GHC.Core.Opt.OccurAnal"
 1580 
 1581                   | isInlinablePragma inl_prag
 1582                   = (inl_prag { inl_inline = NoUserInlinePrag }, noUnfolding)
 1583 
 1584                   | otherwise
 1585                   = (inl_prag, specUnfolding simpl_opts spec_bndrs spec_unf_body
 1586                                              rule_lhs_args fn_unf)
 1587 
 1588                 spec_unf_body body = wrapDictBindsE dumped_dbs $
 1589                                      body `mkApps` spec_args
 1590 
 1591                 --------------------------------------
 1592                 -- Adding arity information just propagates it a bit faster
 1593                 --      See Note [Arity decrease] in GHC.Core.Opt.Simplify
 1594                 -- Copy InlinePragma information from the parent Id.
 1595                 -- So if f has INLINE[1] so does spec_fn
 1596                 arity_decr     = count isValArg rule_lhs_args - count isId spec_bndrs
 1597                 spec_f_w_arity = spec_fn `setIdArity`      max 0 (fn_arity - arity_decr)
 1598                                          `setInlinePragma` spec_inl_prag
 1599                                          `setIdUnfolding`  spec_unf
 1600                                          `asJoinId_maybe`  spec_join_arity
 1601 
 1602                 _rule_trace_doc = vcat [ ppr fn <+> dcolon <+> ppr fn_type
 1603                                        , ppr spec_fn  <+> dcolon <+> ppr spec_fn_ty
 1604                                        , ppr rhs_bndrs, ppr call_args
 1605                                        , ppr spec_rule
 1606                                        ]
 1607 
 1608            ; -- pprTrace "spec_call: rule" _rule_trace_doc
 1609              return ( spec_rule                  : rules_acc
 1610                     , (spec_f_w_arity, spec_rhs) : pairs_acc
 1611                     , spec_uds           `plusUDs` uds_acc
 1612                     ) } }
 1613 
 1614 {- Note [Specialising DFuns]
 1615 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1616 DFuns have a special sort of unfolding (DFunUnfolding), and these are
 1617 hard to specialise a DFunUnfolding to give another DFunUnfolding
 1618 unless the DFun is fully applied (#18120).  So, in the case of DFunIds
 1619 we simply extend the CallKey with trailing UnspecArgs, so we'll
 1620 generate a rule that completely saturates the DFun.
 1621 
 1622 There is an ASSERT that checks this, in the DFunUnfolding case of
 1623 GHC.Core.Unfold.specUnfolding.
 1624 
 1625 Note [Specialisation Must Preserve Sharing]
 1626 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1627 Consider a function:
 1628 
 1629     f :: forall a. Eq a => a -> blah
 1630     f =
 1631       if expensive
 1632          then f1
 1633          else f2
 1634 
 1635 As written, all calls to 'f' will share 'expensive'. But if we specialise 'f'
 1636 at 'Int', eg:
 1637 
 1638     $sfInt = SUBST[a->Int,dict->dEqInt] (if expensive then f1 else f2)
 1639 
 1640     RULE "SPEC f"
 1641       forall (d :: Eq Int).
 1642         f Int _ = $sfIntf
 1643 
 1644 We've now lost sharing between 'f' and '$sfInt' for 'expensive'. Yikes!
 1645 
 1646 To avoid this, we only generate specialisations for functions whose arity is
 1647 enough to bind all of the arguments we need to specialise.  This ensures our
 1648 specialised functions don't do any work before receiving all of their dicts,
 1649 and thus avoids the 'f' case above.
 1650 
 1651 Note [Specialisations Must Be Lifted]
 1652 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1653 Consider a function 'f':
 1654 
 1655     f = forall a. Eq a => Array# a
 1656 
 1657 used like
 1658 
 1659     case x of
 1660       True -> ...f @Int dEqInt...
 1661       False -> 0
 1662 
 1663 Naively, we might generate an (expensive) specialisation
 1664 
 1665     $sfInt :: Array# Int
 1666 
 1667 even in the case that @x = False@! Instead, we add a dummy 'Void#' argument to
 1668 the specialisation '$sfInt' ($sfInt :: Void# -> Array# Int) in order to
 1669 preserve laziness.
 1670 
 1671 Note [Specialising Calls]
 1672 ~~~~~~~~~~~~~~~~~~~~~~~~~
 1673 Suppose we have a function with a complicated type:
 1674 
 1675     f :: forall a b c. Int -> Eq a => Show b => c -> Blah
 1676     f @a @b @c i dEqA dShowA x = blah
 1677 
 1678 and suppose it is called at:
 1679 
 1680     f 7 @T1 @T2 @T3 dEqT1 ($dfShow dShowT2) t3
 1681 
 1682 This call is described as a 'CallInfo' whose 'ci_key' is:
 1683 
 1684     [ SpecType T1, SpecType T2, UnspecType, UnspecArg, SpecDict dEqT1
 1685     , SpecDict ($dfShow dShowT2), UnspecArg ]
 1686 
 1687 Why are 'a' and 'b' identified as 'SpecType', while 'c' is 'UnspecType'?
 1688 Because we must specialise the function on type variables that appear
 1689 free in its *dictionary* arguments; but not on type variables that do not
 1690 appear in any dictionaries, i.e. are fully polymorphic.
 1691 
 1692 Because this call has dictionaries applied, we'd like to specialise
 1693 the call on any type argument that appears free in those dictionaries.
 1694 In this case, those are [a :-> T1, b :-> T2].
 1695 
 1696 We also need to substitute the dictionary binders with their
 1697 specialised dictionaries. The simplest substitution would be
 1698 [dEqA :-> dEqT1, dShowA :-> $dfShow dShowT2], but this duplicates
 1699 work, since `$dfShow dShowT2` is a function application. Therefore, we
 1700 also want to *float the dictionary out* (via bindAuxiliaryDict),
 1701 creating a new dict binding
 1702 
 1703     dShow1 = $dfShow dShowT2
 1704 
 1705 and the substitution [dEqA :-> dEqT1, dShowA :-> dShow1].
 1706 
 1707 With the substitutions in hand, we can generate a specialised function:
 1708 
 1709     $sf :: forall c. Int -> c -> Blah
 1710     $sf = SUBST[a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1] (\@c i x -> blah)
 1711 
 1712 Note that the substitution is applied to the whole thing.  This is
 1713 convenient, but just slightly fragile.  Notably:
 1714   * There had better be no name clashes in a/b/c
 1715 
 1716 We must construct a rewrite rule:
 1717 
 1718     RULE "SPEC f @T1 @T2 _"
 1719       forall (@c :: Type) (i :: Int) (d1 :: Eq T1) (d2 :: Show T2).
 1720         f @T1 @T2 @c i d1 d2 = $sf @c i
 1721 
 1722 In the rule, d1 and d2 are just wildcards, not used in the RHS.  Note
 1723 additionally that 'x' isn't captured by this rule --- we bind only
 1724 enough etas in order to capture all of the *specialised* arguments.
 1725 
 1726 Note [Drop dead args from specialisations]
 1727 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1728 When specialising a function, it’s possible some of the arguments may
 1729 actually be dead. For example, consider:
 1730 
 1731     f :: forall a. () -> Show a => a -> String
 1732     f x y = show y ++ "!"
 1733 
 1734 We might generate the following CallInfo for `f @Int`:
 1735 
 1736     [SpecType Int, UnspecArg, SpecDict $dShowInt, UnspecArg]
 1737 
 1738 Normally we’d include both the x and y arguments in the
 1739 specialisation, since we’re not specialising on either of them. But
 1740 that’s silly, since x is actually unused! So we might as well drop it
 1741 in the specialisation:
 1742 
 1743     $sf :: Int -> String
 1744     $sf y = show y ++ "!"
 1745 
 1746     {-# RULE "SPEC f @Int" forall x. f @Int x $dShow = $sf #-}
 1747 
 1748 This doesn’t save us much, since the arg would be removed later by
 1749 worker/wrapper, anyway, but it’s easy to do.
 1750 
 1751 Wrinkles
 1752 
 1753 * Note that we only drop dead arguments if:
 1754     1. We don’t specialise on them.
 1755     2. They come before an argument we do specialise on.
 1756   Doing the latter would require eta-expanding the RULE, which could
 1757   make it match less often, so it’s not worth it. Doing the former could
 1758   be more useful --- it would stop us from generating pointless
 1759   specialisations --- but it’s more involved to implement and unclear if
 1760   it actually provides much benefit in practice.
 1761 
 1762 * If the function has a stable unfolding, specHeader has to come up with
 1763   arguments to pass to that stable unfolding, when building the stable
 1764   unfolding of the specialised function: this is the last field in specHeader's
 1765   big result tuple.
 1766 
 1767   The right thing to do is to produce a LitRubbish; it should rapidly
 1768   disappear.  Rather like GHC.Core.Opt.WorkWrap.Utils.mk_absent_let.
 1769 
 1770 Note [Zap occ info in rule binders]
 1771 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1772 When we generate a specialisation RULE, we need to drop occurrence
 1773 info on the binders. If we don’t, things go wrong when we specialise a
 1774 function like
 1775 
 1776     f :: forall a. () -> Show a => a -> String
 1777     f x y = show y ++ "!"
 1778 
 1779 since we’ll generate a RULE like
 1780 
 1781     RULE "SPEC f @Int" forall x [Occ=Dead].
 1782       f @Int x $dShow = $sf
 1783 
 1784 and Core Lint complains, even though x only appears on the LHS (due to
 1785 Note [Drop dead args from specialisations]).
 1786 
 1787 Why is that a Lint error? Because the arguments on the LHS of a rule
 1788 are syntactically expressions, not patterns, so Lint treats the
 1789 appearance of x as a use rather than a binding. Fortunately, the
 1790 solution is simple: we just make sure to zap the occ info before
 1791 using ids as wildcard binders in a rule.
 1792 
 1793 Note [Account for casts in binding]
 1794 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1795 Consider
 1796    f :: Eq a => a -> IO ()
 1797    {-# INLINABLE f
 1798        StableUnf = (/\a \(d:Eq a) (x:a). blah) |> g
 1799      #-}
 1800    f = ...
 1801 
 1802 In f's stable unfolding we have done some modest simplification which
 1803 has pushed the cast to the outside.  (I wonder if this is the Right
 1804 Thing, but it's what happens now; see GHC.Core.Opt.Simplify.Utils Note [Casts and
 1805 lambdas].)  Now that stable unfolding must be specialised, so we want
 1806 to push the cast back inside. It would be terrible if the cast
 1807 defeated specialisation!  Hence the use of collectBindersPushingCo.
 1808 
 1809 Note [Evidence foralls]
 1810 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 1811 Suppose (#12212) that we are specialising
 1812    f :: forall a b. (Num a, F a ~ F b) => blah
 1813 with a=b=Int. Then the RULE will be something like
 1814    RULE forall (d:Num Int) (g :: F Int ~ F Int).
 1815         f Int Int d g = f_spec
 1816 But both varToCoreExpr (when constructing the LHS args), and the
 1817 simplifier (when simplifying the LHS args), will transform to
 1818    RULE forall (d:Num Int) (g :: F Int ~ F Int).
 1819         f Int Int d <F Int> = f_spec
 1820 by replacing g with Refl.  So now 'g' is unbound, which results in a later
 1821 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
 1822  * varToCoreExpr generates a Refl
 1823  * exprsFreeIdsList returns the Ids bound by the args,
 1824    which won't include g
 1825 
 1826 You might wonder if this will match as often, but the simplifier replaces
 1827 complicated Refl coercions with Refl pretty aggressively.
 1828 
 1829 Note [Orphans and auto-generated rules]
 1830 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1831 When we specialise an INLINABLE function, or when we have
 1832 -fspecialise-aggressively, we auto-generate RULES that are orphans.
 1833 We don't want to warn about these, or we'd generate a lot of warnings.
 1834 Thus, we only warn about user-specified orphan rules.
 1835 
 1836 Indeed, we don't even treat the module as an orphan module if it has
 1837 auto-generated *rule* orphans.  Orphan modules are read every time we
 1838 compile, so they are pretty obtrusive and slow down every compilation,
 1839 even non-optimised ones.  (Reason: for type class instances it's a
 1840 type correctness issue.)  But specialisation rules are strictly for
 1841 *optimisation* only so it's fine not to read the interface.
 1842 
 1843 What this means is that a SPEC rules from auto-specialisation in
 1844 module M will be used in other modules only if M.hi has been read for
 1845 some other reason, which is actually pretty likely.
 1846 
 1847 Note [From non-recursive to recursive]
 1848 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1849 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
 1850 have built a recursive knot
 1851 
 1852       f a d x = <blah>
 1853       MkUD { ud_binds = NonRec d7  (MkD ..f..)
 1854            , ud_calls = ...(f T d7)... }
 1855 
 1856 The we generate
 1857 
 1858      Rec { fs x = <blah>[T/a, d7/d]
 1859            f a d x = <blah>
 1860                RULE f T _ = fs
 1861            d7 = ...f... }
 1862 
 1863 Here the recursion is only through the RULE.
 1864 
 1865 However we definitely should /not/ make the Rec in this wildly common
 1866 case:
 1867       d = ...
 1868       MkUD { ud_binds = NonRec d7 (...d...)
 1869            , ud_calls = ...(f T d7)... }
 1870 
 1871 Here we want simply to add d to the floats, giving
 1872       MkUD { ud_binds = NonRec d (...)
 1873                         NonRec d7 (...d...)
 1874            , ud_calls = ...(f T d7)... }
 1875 
 1876 In general, we need only make this Rec if
 1877   - there are some specialisations (spec_binds non-empty)
 1878   - there are some dict_binds that depend on f (dump_dbs non-empty)
 1879 
 1880 Note [Avoiding loops (DFuns)]
 1881 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1882 When specialising /dictionary functions/ we must be very careful to
 1883 avoid building loops. Here is an example that bit us badly, on
 1884 several distinct occasions.
 1885 
 1886 Here is one: #3591
 1887      class Eq a => C a
 1888      instance Eq [a] => C [a]
 1889 
 1890 This translates to
 1891      dfun :: Eq [a] -> C [a]
 1892      dfun a d = MkD a d (meth d)
 1893 
 1894      d4 :: Eq [T] = <blah>
 1895      d2 ::  C [T] = dfun T d4
 1896      d1 :: Eq [T] = $p1 d2
 1897      d3 ::  C [T] = dfun T d1
 1898 
 1899 None of these definitions is recursive. What happened was that we
 1900 generated a specialisation:
 1901      RULE forall d. dfun T d = dT  :: C [T]
 1902      dT = (MkD a d (meth d)) [T/a, d1/d]
 1903         = MkD T d1 (meth d1)
 1904 
 1905 But now we use the RULE on the RHS of d2, to get
 1906     d2 = dT = MkD d1 (meth d1)
 1907     d1 = $p1 d2
 1908 
 1909 and now d1 is bottom!  The problem is that when specialising 'dfun' we
 1910 should first dump "below" the binding all floated dictionary bindings
 1911 that mention 'dfun' itself.  So d2 and d3 (and hence d1) must be
 1912 placed below 'dfun', and thus unavailable to it when specialising
 1913 'dfun'.  That in turn means that the call (dfun T d1) must be
 1914 discarded.  On the other hand, the call (dfun T d4) is fine, assuming
 1915 d4 doesn't mention dfun.
 1916 
 1917 Solution:
 1918   Discard all calls that mention dictionaries that depend
 1919   (directly or indirectly) on the dfun we are specialising.
 1920   This is done by 'filterCalls'
 1921 
 1922 Note [Avoiding loops (non-DFuns)]
 1923 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1924 The whole Note [Avoiding loops (DFuns)] things applies only to DFuns.
 1925 It's important /not/ to apply filterCalls to non-DFuns. For example:
 1926 
 1927   class C a where { foo,bar :: [a] -> [a] }
 1928 
 1929   instance C Int where
 1930      foo x = r_bar x
 1931      bar xs = reverse xs
 1932 
 1933   r_bar :: C a => [a] -> [a]
 1934   r_bar xs = bar (xs ++ xs)
 1935 
 1936 That translates to:
 1937 
 1938     r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
 1939 
 1940     Rec { $fCInt :: C Int = MkC foo_help reverse
 1941           foo_help (xs::[Int]) = r_bar Int $fCInt xs }
 1942 
 1943 The call (r_bar $fCInt) mentions $fCInt,
 1944                         which mentions foo_help,
 1945                         which mentions r_bar
 1946 
 1947 But we DO want to specialise r_bar at Int:
 1948     Rec { $fCInt :: C Int = MkC foo_help reverse
 1949           foo_help (xs::[Int]) = r_bar Int $fCInt xs
 1950 
 1951           r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
 1952             RULE r_bar Int _ = r_bar_Int
 1953 
 1954           r_bar_Int xs = bar Int $fCInt (xs ++ xs)
 1955            }
 1956 
 1957 Note that, because of its RULE, r_bar joins the recursive
 1958 group.  (In this case it'll unravel a short moment later.)
 1959 See test simplCore/should_compile/T19599a.
 1960 
 1961 Another example is #19599, which looked like this:
 1962 
 1963    class (Show a, Enum a) => MyShow a where
 1964       myShow :: a -> String
 1965 
 1966    myShow_impl :: MyShow a => a -> String
 1967 
 1968    foo :: Int -> String
 1969    foo = myShow_impl @Int $fMyShowInt
 1970 
 1971    Rec { $fMyShowInt = MkMyShowD $fEnumInt $fShowInt $cmyShow
 1972        ; $cmyShow = myShow_impl @Int $fMyShowInt }
 1973 
 1974 Here, we really do want to specialise `myShow_impl @Int $fMyShowInt`.
 1975 
 1976 
 1977 Note [Specialising a recursive group]
 1978 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1979 Consider
 1980     let rec { f x = ...g x'...
 1981             ; g y = ...f y'.... }
 1982     in f 'a'
 1983 Here we specialise 'f' at Char; but that is very likely to lead to
 1984 a specialisation of 'g' at Char.  We must do the latter, else the
 1985 whole point of specialisation is lost.
 1986 
 1987 But we do not want to keep iterating to a fixpoint, because in the
 1988 presence of polymorphic recursion we might generate an infinite number
 1989 of specialisations.
 1990 
 1991 So we use the following heuristic:
 1992   * Arrange the rec block in dependency order, so far as possible
 1993     (the occurrence analyser already does this)
 1994 
 1995   * Specialise it much like a sequence of lets
 1996 
 1997   * Then go through the block a second time, feeding call-info from
 1998     the RHSs back in the bottom, as it were
 1999 
 2000 In effect, the ordering maxmimises the effectiveness of each sweep,
 2001 and we do just two sweeps.   This should catch almost every case of
 2002 monomorphic recursion -- the exception could be a very knotted-up
 2003 recursion with multiple cycles tied up together.
 2004 
 2005 This plan is implemented in the Rec case of specBindItself.
 2006 
 2007 Note [Specialisations already covered]
 2008 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2009 We obviously don't want to generate two specialisations for the same
 2010 argument pattern.  There are two wrinkles
 2011 
 2012 1. We do the already-covered test in specDefn, not when we generate
 2013 the CallInfo in mkCallUDs.  We used to test in the latter place, but
 2014 we now iterate the specialiser somewhat, and the Id at the call site
 2015 might therefore not have all the RULES that we can see in specDefn
 2016 
 2017 2. What about two specialisations where the second is an *instance*
 2018 of the first?  If the more specific one shows up first, we'll generate
 2019 specialisations for both.  If the *less* specific one shows up first,
 2020 we *don't* currently generate a specialisation for the more specific
 2021 one.  (See the call to lookupRule in already_covered.)  Reasons:
 2022   (a) lookupRule doesn't say which matches are exact (bad reason)
 2023   (b) if the earlier specialisation is user-provided, it's
 2024       far from clear that we should auto-specialise further
 2025 
 2026 Note [Auto-specialisation and RULES]
 2027 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2028 Consider:
 2029    g :: Num a => a -> a
 2030    g = ...
 2031 
 2032    f :: (Int -> Int) -> Int
 2033    f w = ...
 2034    {-# RULE f g = 0 #-}
 2035 
 2036 Suppose that auto-specialisation makes a specialised version of
 2037 g::Int->Int That version won't appear in the LHS of the RULE for f.
 2038 So if the specialisation rule fires too early, the rule for f may
 2039 never fire.
 2040 
 2041 It might be possible to add new rules, to "complete" the rewrite system.
 2042 Thus when adding
 2043         RULE forall d. g Int d = g_spec
 2044 also add
 2045         RULE f g_spec = 0
 2046 
 2047 But that's a bit complicated.  For now we ask the programmer's help,
 2048 by *copying the INLINE activation pragma* to the auto-specialised
 2049 rule.  So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
 2050 will also not be active until phase 2.  And that's what programmers
 2051 should jolly well do anyway, even aside from specialisation, to ensure
 2052 that g doesn't inline too early.
 2053 
 2054 This in turn means that the RULE would never fire for a NOINLINE
 2055 thing so not much point in generating a specialisation at all.
 2056 
 2057 Note [Specialisation shape]
 2058 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2059 We only specialise a function if it has visible top-level lambdas
 2060 corresponding to its overloading.  E.g. if
 2061         f :: forall a. Eq a => ....
 2062 then its body must look like
 2063         f = /\a. \d. ...
 2064 
 2065 Reason: when specialising the body for a call (f ty dexp), we want to
 2066 substitute dexp for d, and pick up specialised calls in the body of f.
 2067 
 2068 This doesn't always work.  One example I came across was this:
 2069         newtype Gen a = MkGen{ unGen :: Int -> a }
 2070 
 2071         choose :: Eq a => a -> Gen a
 2072         choose n = MkGen (\r -> n)
 2073 
 2074         oneof = choose (1::Int)
 2075 
 2076 It's a silly example, but we get
 2077         choose = /\a. g `cast` co
 2078 where choose doesn't have any dict arguments.  Thus far I have not
 2079 tried to fix this (wait till there's a real example).
 2080 
 2081 Mind you, then 'choose' will be inlined (since RHS is trivial) so
 2082 it doesn't matter.  This comes up with single-method classes
 2083 
 2084    class C a where { op :: a -> a }
 2085    instance C a => C [a] where ....
 2086 ==>
 2087    $fCList :: C a => C [a]
 2088    $fCList = $copList |> (...coercion>...)
 2089    ....(uses of $fCList at particular types)...
 2090 
 2091 So we suppress the WARN if the rhs is trivial.
 2092 
 2093 Note [Inline specialisations]
 2094 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2095 Here is what we do with the InlinePragma of the original function
 2096   * Activation/RuleMatchInfo: both transferred to the
 2097                               specialised function
 2098   * InlineSpec:
 2099        (a) An INLINE pragma is transferred
 2100        (b) An INLINABLE pragma is *not* transferred
 2101 
 2102 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
 2103 specialise the function at its call site, and arguably that's not so
 2104 important for the specialised copies.  BUT *pragma-directed*
 2105 specialisation now takes place in the typechecker/desugarer, with
 2106 manually specified INLINEs.  The specialisation here is automatic.
 2107 It'd be very odd if a function marked INLINE was specialised (because
 2108 of some local use), and then forever after (including importing
 2109 modules) the specialised version wasn't INLINEd.  After all, the
 2110 programmer said INLINE!
 2111 
 2112 You might wonder why we specialise INLINE functions at all.  After
 2113 all they should be inlined, right?  Two reasons:
 2114 
 2115  * Even INLINE functions are sometimes not inlined, when they aren't
 2116    applied to interesting arguments.  But perhaps the type arguments
 2117    alone are enough to specialise (even though the args are too boring
 2118    to trigger inlining), and it's certainly better to call the
 2119    specialised version.
 2120 
 2121  * The RHS of an INLINE function might call another overloaded function,
 2122    and we'd like to generate a specialised version of that function too.
 2123    This actually happens a lot. Consider
 2124       replicateM_ :: (Monad m) => Int -> m a -> m ()
 2125       {-# INLINABLE replicateM_ #-}
 2126       replicateM_ d x ma = ...
 2127    The strictness analyser may transform to
 2128       replicateM_ :: (Monad m) => Int -> m a -> m ()
 2129       {-# INLINE replicateM_ #-}
 2130       replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
 2131 
 2132       $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
 2133       {-# INLINABLE $wreplicateM_ #-}
 2134       $wreplicateM_ = ...
 2135    Now an importing module has a specialised call to replicateM_, say
 2136    (replicateM_ dMonadIO).  We certainly want to specialise $wreplicateM_!
 2137    This particular example had a huge effect on the call to replicateM_
 2138    in nofib/shootout/n-body.
 2139 
 2140 Why (b): discard INLINABLE pragmas? See #4874 for persuasive examples.
 2141 Suppose we have
 2142     {-# INLINABLE f #-}
 2143     f :: Ord a => [a] -> Int
 2144     f xs = letrec f' = ...f'... in f'
 2145 Then, when f is specialised and optimised we might get
 2146     wgo :: [Int] -> Int#
 2147     wgo = ...wgo...
 2148     f_spec :: [Int] -> Int
 2149     f_spec xs = case wgo xs of { r -> I# r }
 2150 and we clearly want to inline f_spec at call sites.  But if we still
 2151 have the big, un-optimised of f (albeit specialised) captured in an
 2152 INLINABLE pragma for f_spec, we won't get that optimisation.
 2153 
 2154 So we simply drop INLINABLE pragmas when specialising. It's not really
 2155 a complete solution; ignoring specialisation for now, INLINABLE functions
 2156 don't get properly strictness analysed, for example. But it works well
 2157 for examples involving specialisation, which is the dominant use of
 2158 INLINABLE.  See #4874.
 2159 -}
 2160 
 2161 {- *********************************************************************
 2162 *                                                                      *
 2163                    SpecArg, and specHeader
 2164 *                                                                      *
 2165 ********************************************************************* -}
 2166 
 2167 -- | An argument that we might want to specialise.
 2168 -- See Note [Specialising Calls] for the nitty gritty details.
 2169 data SpecArg
 2170   =
 2171     -- | Type arguments that should be specialised, due to appearing
 2172     -- free in the type of a 'SpecDict'.
 2173     SpecType Type
 2174 
 2175     -- | Type arguments that should remain polymorphic.
 2176   | UnspecType
 2177 
 2178     -- | Dictionaries that should be specialised. mkCallUDs ensures
 2179     -- that only "interesting" dictionary arguments get a SpecDict;
 2180     -- see Note [Interesting dictionary arguments]
 2181   | SpecDict DictExpr
 2182 
 2183     -- | Value arguments that should not be specialised.
 2184   | UnspecArg
 2185 
 2186 instance Outputable SpecArg where
 2187   ppr (SpecType t) = text "SpecType" <+> ppr t
 2188   ppr UnspecType   = text "UnspecType"
 2189   ppr (SpecDict d) = text "SpecDict" <+> ppr d
 2190   ppr UnspecArg    = text "UnspecArg"
 2191 
 2192 specArgFreeIds :: SpecArg -> IdSet
 2193 specArgFreeIds (SpecType {}) = emptyVarSet
 2194 specArgFreeIds (SpecDict dx) = exprFreeIds dx
 2195 specArgFreeIds UnspecType    = emptyVarSet
 2196 specArgFreeIds UnspecArg     = emptyVarSet
 2197 
 2198 isSpecDict :: SpecArg -> Bool
 2199 isSpecDict (SpecDict {}) = True
 2200 isSpecDict _             = False
 2201 
 2202 -- | Given binders from an original function 'f', and the 'SpecArg's
 2203 -- corresponding to its usage, compute everything necessary to build
 2204 -- a specialisation.
 2205 --
 2206 -- We will use the running example from Note [Specialising Calls]:
 2207 --
 2208 --     f :: forall a b c. Int -> Eq a => Show b => c -> Blah
 2209 --     f @a @b @c i dEqA dShowA x = blah
 2210 --
 2211 -- Suppose we decide to specialise it at the following pattern:
 2212 --
 2213 --     [ SpecType T1, SpecType T2, UnspecType, UnspecArg
 2214 --     , SpecDict dEqT1, SpecDict ($dfShow dShowT2), UnspecArg ]
 2215 --
 2216 -- We'd eventually like to build the RULE
 2217 --
 2218 --     RULE "SPEC f @T1 @T2 _"
 2219 --       forall (@c :: Type) (i :: Int) (d1 :: Eq T1) (d2 :: Show T2).
 2220 --         f @T1 @T2 @c i d1 d2 = $sf @c i
 2221 --
 2222 -- and the specialisation '$sf'
 2223 --
 2224 --     $sf :: forall c. Int -> c -> Blah
 2225 --     $sf = SUBST[a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1] (\@c i x -> blah)
 2226 --
 2227 -- where dShow1 is a floated binding created by bindAuxiliaryDict.
 2228 --
 2229 -- The cases for 'specHeader' below are presented in the same order as this
 2230 -- running example. The result of 'specHeader' for this example is as follows:
 2231 --
 2232 --    ( -- Returned arguments
 2233 --      env + [a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1]
 2234 --    , [x]
 2235 --
 2236 --      -- RULE helpers
 2237 --    , [c, i, d1, d2]
 2238 --    , [T1, T2, c, i, d1, d2]
 2239 --
 2240 --      -- Specialised function helpers
 2241 --    , [c, i, x]
 2242 --    , [dShow1 = $dfShow dShowT2]
 2243 --    , [T1, T2, c, i, dEqT1, dShow1]
 2244 --    )
 2245 specHeader
 2246      :: SpecEnv
 2247      -> [InBndr]    -- The binders from the original function 'f'
 2248      -> [SpecArg]   -- From the CallInfo
 2249      -> SpecM ( Bool     -- True <=> some useful specialisation happened
 2250                          -- Not the same as any (isSpecDict args) because
 2251                          -- the args might be longer than bndrs
 2252 
 2253                 -- Returned arguments
 2254               , SpecEnv      -- Substitution to apply to the body of 'f'
 2255               , [OutBndr]    -- Leftover binders from the original function 'f'
 2256                              --   that don’t have a corresponding SpecArg
 2257 
 2258                 -- RULE helpers
 2259               , [OutBndr]    -- Binders for the RULE
 2260               , [OutExpr]    -- Args for the LHS of the rule
 2261 
 2262                 -- Specialised function helpers
 2263               , [OutBndr]    -- Binders for $sf
 2264               , [DictBind]   -- Auxiliary dictionary bindings
 2265               , [OutExpr]    -- Specialised arguments for unfolding
 2266                              -- Same length as "Args for LHS of rule"
 2267               )
 2268 
 2269 -- We want to specialise on type 'T1', and so we must construct a substitution
 2270 -- 'a->T1', as well as a LHS argument for the resulting RULE and unfolding
 2271 -- details.
 2272 specHeader env (bndr : bndrs) (SpecType ty : args)
 2273   = do { let in_scope = Core.substInScope (se_subst env)
 2274              qvars    = scopedSort $
 2275                         filterOut (`elemInScopeSet` in_scope) $
 2276                         tyCoVarsOfTypeList ty
 2277              (env1, qvars') = substBndrs env qvars
 2278              ty'            = substTy env1 ty
 2279              env2           = extendTvSubstList env1 [(bndr, ty')]
 2280        ; (useful, env3, leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
 2281             <- specHeader env2 bndrs args
 2282        ; pure ( useful
 2283               , env3
 2284               , leftover_bndrs
 2285               , qvars' ++ rule_bs
 2286               , Type ty' : rule_es
 2287               , qvars' ++ bs'
 2288               , dx
 2289               , Type ty' : spec_args
 2290               )
 2291        }
 2292 
 2293 -- Next we have a type that we don't want to specialise. We need to perform
 2294 -- a substitution on it (in case the type refers to 'a'). Additionally, we need
 2295 -- to produce a binder, LHS argument and RHS argument for the resulting rule,
 2296 -- /and/ a binder for the specialised body.
 2297 specHeader env (bndr : bndrs) (UnspecType : args)
 2298   = do { let (env', bndr') = substBndr env bndr
 2299        ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
 2300             <- specHeader env' bndrs args
 2301        ; pure ( useful
 2302               , env''
 2303               , leftover_bndrs
 2304               , bndr' : rule_bs
 2305               , varToCoreExpr bndr' : rule_es
 2306               , bndr' : bs'
 2307               , dx
 2308               , varToCoreExpr bndr' : spec_args
 2309               )
 2310        }
 2311 
 2312 -- Next we want to specialise the 'Eq a' dict away. We need to construct
 2313 -- a wildcard binder to match the dictionary (See Note [Specialising Calls] for
 2314 -- the nitty-gritty), as a LHS rule and unfolding details.
 2315 specHeader env (bndr : bndrs) (SpecDict d : args)
 2316   = do { bndr' <- newDictBndr env bndr -- See Note [Zap occ info in rule binders]
 2317        ; let (env', dx_bind, spec_dict) = bindAuxiliaryDict env bndr bndr' d
 2318        ; (_, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
 2319              <- specHeader env' bndrs args
 2320        ; pure ( True      -- Ha!  A useful specialisation!
 2321               , env''
 2322               , leftover_bndrs
 2323               -- See Note [Evidence foralls]
 2324               , exprFreeIdsList (varToCoreExpr bndr') ++ rule_bs
 2325               , varToCoreExpr bndr' : rule_es
 2326               , bs'
 2327               , maybeToList dx_bind ++ dx
 2328               , spec_dict : spec_args
 2329               )
 2330        }
 2331 
 2332 -- Finally, we have the unspecialised argument 'i'. We need to produce
 2333 -- a binder, LHS and RHS argument for the RULE, and a binder for the
 2334 -- specialised body.
 2335 --
 2336 -- NB: Calls to 'specHeader' will trim off any trailing 'UnspecArg's, which is
 2337 -- why 'i' doesn't appear in our RULE above. But we have no guarantee that
 2338 -- there aren't 'UnspecArg's which come /before/ all of the dictionaries, so
 2339 -- this case must be here.
 2340 specHeader env (bndr : bndrs) (UnspecArg : args)
 2341   = do { -- see Note [Zap occ info in rule binders]
 2342          let (env', bndr') = substBndr env (zapIdOccInfo bndr)
 2343        ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
 2344              <- specHeader env' bndrs args
 2345 
 2346        ; let bndr_ty = idType bndr'
 2347 
 2348              -- See Note [Drop dead args from specialisations]
 2349              -- C.f. GHC.Core.Opt.WorkWrap.Utils.mk_absent_let
 2350              (mb_spec_bndr, spec_arg)
 2351                 | isDeadBinder bndr
 2352                 , Just lit_expr <- mkLitRubbish bndr_ty
 2353                 = (Nothing, lit_expr)
 2354                 | otherwise
 2355                 = (Just bndr', varToCoreExpr bndr')
 2356 
 2357        ; pure ( useful
 2358               , env''
 2359               , leftover_bndrs
 2360               , bndr' : rule_bs
 2361               , varToCoreExpr bndr' : rule_es
 2362               , case mb_spec_bndr of
 2363                   Just b' -> b' : bs'
 2364                   Nothing -> bs'
 2365               , dx
 2366               , spec_arg : spec_args
 2367               )
 2368        }
 2369 
 2370 -- If we run out of binders, stop immediately
 2371 -- See Note [Specialisation Must Preserve Sharing]
 2372 specHeader env [] _ = pure (False, env, [], [], [], [], [], [])
 2373 
 2374 -- Return all remaining binders from the original function. These have the
 2375 -- invariant that they should all correspond to unspecialised arguments, so
 2376 -- it's safe to stop processing at this point.
 2377 specHeader env bndrs []
 2378   = pure (False, env', bndrs', [], [], [], [], [])
 2379   where
 2380     (env', bndrs') = substBndrs env bndrs
 2381 
 2382 
 2383 -- | Binds a dictionary argument to a fresh name, to preserve sharing
 2384 bindAuxiliaryDict
 2385   :: SpecEnv
 2386   -> InId -> OutId -> OutExpr -- Original dict binder, and the witnessing expression
 2387   -> ( SpecEnv        -- Substitute for orig_dict_id
 2388      , Maybe DictBind -- Auxiliary dict binding, if any
 2389      , OutExpr)        -- Witnessing expression (always trivial)
 2390 bindAuxiliaryDict env@(SE { se_subst = subst, se_interesting = interesting })
 2391                   orig_dict_id fresh_dict_id dict_expr
 2392 
 2393   -- If the dictionary argument is trivial,
 2394   -- don’t bother creating a new dict binding; just substitute
 2395   | Just dict_id <- getIdFromTrivialExpr_maybe dict_expr
 2396   = let env' = env { se_subst = Core.extendSubst subst orig_dict_id dict_expr
 2397                                 `Core.extendInScope` dict_id
 2398                           -- See Note [Keep the old dictionaries interesting]
 2399                    , se_interesting = interesting `extendVarSet` dict_id }
 2400     in (env', Nothing, dict_expr)
 2401 
 2402   | otherwise  -- Non-trivial dictionary arg; make an auxiliary binding
 2403   = let dict_bind = mkDB (NonRec fresh_dict_id dict_expr)
 2404         env' = env { se_subst = Core.extendSubst subst orig_dict_id (Var fresh_dict_id)
 2405                                 `Core.extendInScope` fresh_dict_id
 2406                       -- See Note [Make the new dictionaries interesting]
 2407                    , se_interesting = interesting `extendVarSet` fresh_dict_id }
 2408     in (env', Just dict_bind, Var fresh_dict_id)
 2409 
 2410 {-
 2411 Note [Make the new dictionaries interesting]
 2412 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2413 Important!  We're going to substitute dx_id1 for d
 2414 and we want it to look "interesting", else we won't gather *any*
 2415 consequential calls. E.g.
 2416     f d = ...g d....
 2417 If we specialise f for a call (f (dfun dNumInt)), we'll get
 2418 a consequent call (g d') with an auxiliary definition
 2419     d' = df dNumInt
 2420 We want that consequent call to look interesting
 2421 
 2422 Note [Keep the old dictionaries interesting]
 2423 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2424 In bindAuxiliaryDict, we don’t bother creating a new dict binding if
 2425 the dict expression is trivial. For example, if we have
 2426 
 2427     f = \ @m1 (d1 :: Monad m1) -> ...
 2428 
 2429 and we specialize it at the pattern
 2430 
 2431     [SpecType IO, SpecArg $dMonadIO]
 2432 
 2433 it would be silly to create a new binding for $dMonadIO; it’s already
 2434 a binding! So we just extend the substitution directly:
 2435 
 2436     m1 :-> IO
 2437     d1 :-> $dMonadIO
 2438 
 2439 But this creates a new subtlety: the dict expression might be a dict
 2440 binding we floated out while specializing another function. For
 2441 example, we might have
 2442 
 2443     d2 = $p1Monad $dMonadIO -- floated out by bindAuxiliaryDict
 2444     $sg = h @IO d2
 2445     h = \ @m2 (d2 :: Applicative m2) -> ...
 2446 
 2447 and end up specializing h at the following pattern:
 2448 
 2449     [SpecType IO, SpecArg d2]
 2450 
 2451 When we created the d2 binding in the first place, we locally marked
 2452 it as interesting while specializing g as described above by
 2453 Note [Make the new dictionaries interesting]. But when we go to
 2454 specialize h, it isn’t in the SpecEnv anymore, so we’ve lost the
 2455 knowledge that we should specialize on it.
 2456 
 2457 To fix this, we have to explicitly add d2 *back* to the interesting
 2458 set. That way, it will still be considered interesting while
 2459 specializing the body of h. See !2913.
 2460 -}
 2461 
 2462 
 2463 {- *********************************************************************
 2464 *                                                                      *
 2465             UsageDetails and suchlike
 2466 *                                                                      *
 2467 ********************************************************************* -}
 2468 
 2469 data UsageDetails
 2470   = MkUD {
 2471       ud_binds :: !(Bag DictBind),
 2472                -- See Note [Floated dictionary bindings]
 2473                -- The order is important;
 2474                -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
 2475                -- (Remember, Bags preserve order in GHC.)
 2476 
 2477       ud_calls :: !CallDetails
 2478 
 2479       -- INVARIANT: suppose bs = bindersOf ud_binds
 2480       -- Then 'calls' may *mention* 'bs',
 2481       -- but there should be no calls *for* bs
 2482     }
 2483 
 2484 -- | A 'DictBind' is a binding along with a cached set containing its free
 2485 -- variables (both type variables and dictionaries)
 2486 data DictBind = DB { db_bind :: CoreBind, db_fvs :: VarSet }
 2487 
 2488 {- Note [Floated dictionary bindings]
 2489 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2490 We float out dictionary bindings for the reasons described under
 2491 "Dictionary floating" above.  But not /just/ dictionary bindings.
 2492 Consider
 2493 
 2494    f :: Eq a => blah
 2495    f a d = rhs
 2496 
 2497    $c== :: T -> T -> Bool
 2498    $c== x y = ...
 2499 
 2500    $df :: Eq T
 2501    $df = Eq $c== ...
 2502 
 2503    gurgle = ...(f @T $df)...
 2504 
 2505 We gather the call info for (f @T $df), and we don't want to drop it
 2506 when we come across the binding for $df.  So we add $df to the floats
 2507 and continue.  But then we have to add $c== to the floats, and so on.
 2508 These all float above the binding for 'f', and now we can
 2509 successfully specialise 'f'.
 2510 
 2511 So the DictBinds in (ud_binds :: Bag DictBind) may contain
 2512 non-dictionary bindings too.
 2513 
 2514 Note [Specialising polymorphic dictionaries]
 2515 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2516 Consider
 2517     class M a where { foo :: a -> Int }
 2518 
 2519     instance M (ST s) where ...
 2520     -- dMST :: forall s. M (ST s)
 2521 
 2522     wimwam :: forall a. M a => a -> Int
 2523     wimwam = /\a \(d::M a). body
 2524 
 2525     f :: ST s -> Int
 2526     f = /\s \(x::ST s). wimwam @(ST s) (dMST @s) dx + 1
 2527 
 2528 We'd like to specialise wimwam at (ST s), thus
 2529     $swimwam :: forall s. ST s -> Int
 2530     $swimwam = /\s. body[ST s/a, (dMST @s)/d]
 2531 
 2532     RULE forall s (d :: M (ST s)).
 2533          wimwam @(ST s) d = $swimwam @s
 2534 
 2535 Here are the moving parts:
 2536 
 2537 * We must /not/ dump the CallInfo
 2538     CIS wimwam (CI { ci_key = [@(ST s), dMST @s]
 2539                    , ci_fvs = {dMST} })
 2540   when we come to the /\s.  Instead, we simply let it continue to float
 2541   upwards. Hence ci_fvs is an IdSet, listing the /Ids/ that
 2542   are free in the call, but not the /TyVars/.  Hence using specArgFreeIds
 2543   in singleCall.
 2544 
 2545   NB to be fully kosher we should explicitly quantifying the CallInfo
 2546   over 's', but we don't bother.  This would matter if there was an
 2547   enclosing binding of the same 's', which I don't expect to happen.
 2548 
 2549 * Whe we come to specialise the call, we must remember to quantify
 2550   over 's'.  That is done in the SpecType case of specHeader, where
 2551   we add 's' (called qvars) to the binders of the RULE and the specialised
 2552   function.
 2553 
 2554 * If we have f :: forall m. Monoid m => blah, and two calls
 2555      (f @(Endo b)      (d :: Monoid (Endo b))
 2556      (f @(Endo (c->c)) (d :: Monoid (Endo (c->c)))
 2557   we want to generate a specialisation only for the first.  The second
 2558   is just a substitution instance of the first, with no greater specialisation.
 2559   Hence the call to `remove_dups` in `filterCalls`.
 2560 
 2561 All this arose in #13873, in the unexpected form that a SPECIALISE
 2562 pragma made the program slower!  The reason was that the specialised
 2563 function $sinsertWith arising from the pragma looked rather like `f`
 2564 above, and failed to specialise a call in its body like wimwam.
 2565 Without the pragma, the original call to `insertWith` was completely
 2566 monomorphic, and specialised in one go.
 2567 -}
 2568 
 2569 instance Outputable DictBind where
 2570   ppr (DB { db_bind = bind, db_fvs = fvs })
 2571     = text "DB" <+> braces (sep [ text "bind:" <+> ppr bind
 2572                                 , text "fvs: " <+> ppr fvs ])
 2573 
 2574 instance Outputable UsageDetails where
 2575   ppr (MkUD { ud_binds = dbs, ud_calls = calls })
 2576         = text "MkUD" <+> braces (sep (punctuate comma
 2577                 [text "binds" <+> equals <+> ppr dbs,
 2578                  text "calls" <+> equals <+> ppr calls]))
 2579 
 2580 emptyUDs :: UsageDetails
 2581 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
 2582 
 2583 ------------------------------------------------------------
 2584 type CallDetails  = DIdEnv CallInfoSet
 2585   -- The order of specialized binds and rules depends on how we linearize
 2586   -- CallDetails, so to get determinism we must use a deterministic set here.
 2587   -- See Note [Deterministic UniqFM] in GHC.Types.Unique.DFM
 2588 
 2589 data CallInfoSet = CIS Id (Bag CallInfo)
 2590   -- The list of types and dictionaries is guaranteed to
 2591   -- match the type of f
 2592   -- The Bag may contain duplicate calls (i.e. f @T and another f @T)
 2593   -- These dups are eliminated by already_covered in specCalls
 2594 
 2595 data CallInfo
 2596   = CI { ci_key  :: [SpecArg]   -- All arguments
 2597        , ci_fvs  :: IdSet       -- Free Ids of the ci_key call
 2598                                 -- _not_ including the main id itself, of course
 2599                                 -- NB: excluding tyvars:
 2600                                 --     See Note [Specialising polymorphic dictionaries]
 2601     }
 2602 
 2603 type DictExpr = CoreExpr
 2604 
 2605 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
 2606 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
 2607 
 2608 instance Outputable CallInfoSet where
 2609   ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
 2610                         2 (ppr map)
 2611 
 2612 pprCallInfo :: Id -> CallInfo -> SDoc
 2613 pprCallInfo fn (CI { ci_key = key })
 2614   = ppr fn <+> ppr key
 2615 
 2616 ppr_call_key_ty :: SpecArg -> Maybe SDoc
 2617 ppr_call_key_ty (SpecType ty) = Just $ char '@' <> pprParendType ty
 2618 ppr_call_key_ty UnspecType    = Just $ char '_'
 2619 ppr_call_key_ty (SpecDict _)  = Nothing
 2620 ppr_call_key_ty UnspecArg     = Nothing
 2621 
 2622 instance Outputable CallInfo where
 2623   ppr (CI { ci_key = key, ci_fvs = _fvs })
 2624     = text "CI" <> braces (sep (map ppr key))
 2625 
 2626 unionCalls :: CallDetails -> CallDetails -> CallDetails
 2627 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
 2628 
 2629 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
 2630 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
 2631   CIS f (calls1 `unionBags` calls2)
 2632 
 2633 callDetailsFVs :: CallDetails -> VarSet
 2634 callDetailsFVs calls =
 2635   nonDetStrictFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
 2636   -- It's OK to use nonDetStrictFoldUDFM here because we forget the ordering
 2637   -- immediately by converting to a nondeterministic set.
 2638 
 2639 callInfoFVs :: CallInfoSet -> VarSet
 2640 callInfoFVs (CIS _ call_info) =
 2641   foldr (\(CI { ci_fvs = fv }) vs -> unionVarSet fv vs) emptyVarSet call_info
 2642 
 2643 getTheta :: [TyCoBinder] -> [PredType]
 2644 getTheta = fmap tyBinderType . filter isInvisibleBinder . filter (not . isNamedBinder)
 2645 
 2646 
 2647 ------------------------------------------------------------
 2648 singleCall :: Id -> [SpecArg] -> UsageDetails
 2649 singleCall id args
 2650   = MkUD {ud_binds = emptyBag,
 2651           ud_calls = unitDVarEnv id $ CIS id $
 2652                      unitBag (CI { ci_key  = args -- used to be tys
 2653                                  , ci_fvs  = call_fvs }) }
 2654   where
 2655     call_fvs = foldr (unionVarSet . specArgFreeIds) emptyVarSet args
 2656         -- The type args (tys) are guaranteed to be part of the dictionary
 2657         -- types, because they are just the constrained types,
 2658         -- and the dictionary is therefore sure to be bound
 2659         -- inside the binding for any type variables free in the type;
 2660         -- hence it's safe to neglect tyvars free in tys when making
 2661         -- the free-var set for this call
 2662         -- BUT I don't trust this reasoning; play safe and include tys_fvs
 2663         --
 2664         -- We don't include the 'id' itself.
 2665 
 2666 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
 2667 mkCallUDs env f args
 2668   = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
 2669     res
 2670   where
 2671     res = mkCallUDs' env f args
 2672 
 2673 mkCallUDs' env f args
 2674   | wantCallsFor env f    -- We want it, and...
 2675   , not (null ci_key)     -- this call site has a useful specialisation
 2676   = -- pprTrace "mkCallUDs: keeping" _trace_doc
 2677     singleCall f ci_key
 2678 
 2679   | otherwise  -- See also Note [Specialisations already covered]
 2680   = -- pprTrace "mkCallUDs: discarding" _trace_doc
 2681     emptyUDs
 2682 
 2683   where
 2684     _trace_doc = vcat [ppr f, ppr args, ppr ci_key]
 2685     pis                = fst $ splitPiTys $ idType f
 2686     constrained_tyvars = tyCoVarsOfTypes $ getTheta pis
 2687 
 2688     ci_key :: [SpecArg]
 2689     ci_key = dropWhileEndLE (not . isSpecDict) $
 2690              zipWith mk_spec_arg args pis
 2691              -- Drop trailing args until we get to a SpecDict
 2692              -- In this way the RULE has as few args as possible,
 2693              -- which broadens its applicability, since rules only
 2694              -- fire when saturated
 2695 
 2696     mk_spec_arg :: CoreExpr -> TyCoBinder -> SpecArg
 2697     mk_spec_arg arg (Named bndr)
 2698       |  binderVar bndr `elemVarSet` constrained_tyvars
 2699       = case arg of
 2700           Type ty -> SpecType ty
 2701           _       -> pprPanic "ci_key" $ ppr arg
 2702       |  otherwise = UnspecType
 2703 
 2704     -- For "InvisArg", which are the type-class dictionaries,
 2705     -- we decide on a case by case basis if we want to specialise
 2706     -- on this argument; if so, SpecDict, if not UnspecArg
 2707     mk_spec_arg arg (Anon InvisArg pred)
 2708       | not (isIPLikePred (scaledThing pred))
 2709               -- See Note [Type determines value]
 2710       , interestingDict env arg
 2711               -- See Note [Interesting dictionary arguments]
 2712       = SpecDict arg
 2713 
 2714       | otherwise = UnspecArg
 2715 
 2716     mk_spec_arg _ (Anon VisArg _)
 2717       = UnspecArg
 2718 
 2719 wantCallsFor :: SpecEnv -> Id -> Bool
 2720 wantCallsFor _env _f = True
 2721  -- We could reduce the size of the UsageDetails by being less eager
 2722  -- about collecting calls for LocalIds: there is no point for
 2723  -- ones that are lambda-bound.  We can't decide this by looking at
 2724  -- the (absence of an) unfolding, because unfoldings for local
 2725  -- functions are discarded by cloneBindSM, so no local binder will
 2726  -- have an unfolding at this stage.  We'd have to keep a candidate
 2727  -- set of let-binders.
 2728  --
 2729  -- Not many lambda-bound variables have dictionary arguments, so
 2730  -- this would make little difference anyway.
 2731  --
 2732  -- For imported Ids we could check for an unfolding, but we have to
 2733  -- do so anyway in canSpecImport, and it seems better to have it
 2734  -- all in one place.  So we simply collect usage info for imported
 2735  -- overloaded functions.
 2736 
 2737 {- Note [Type determines value]
 2738 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2739 Only specialise on non-impicit-parameter predicates, because these
 2740 are the ones whose *type* determines their *value*.  In particular,
 2741 with implicit params, the type args *don't* say what the value of the
 2742 implicit param is!  See #7101.
 2743 
 2744 So we treat implicit params just like ordinary arguments for the
 2745 purposes of specialisation.  Note that we still want to specialise
 2746 functions with implicit params if they have *other* dicts which are
 2747 class params; see #17930.
 2748 
 2749 Note [Interesting dictionary arguments]
 2750 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2751 Consider this
 2752          \a.\d:Eq a.  let f = ... in ...(f d)...
 2753 There really is not much point in specialising f wrt the dictionary d,
 2754 because the code for the specialised f is not improved at all, because
 2755 d is lambda-bound.  We simply get junk specialisations.
 2756 
 2757 What is "interesting"?  Just that it has *some* structure.  But what about
 2758 variables?
 2759 
 2760  * A variable might be imported, in which case its unfolding
 2761    will tell us whether it has useful structure
 2762 
 2763  * Local variables are cloned on the way down (to avoid clashes when
 2764    we float dictionaries), and cloning drops the unfolding
 2765    (cloneIdBndr).  Moreover, we make up some new bindings, and it's a
 2766    nuisance to give them unfoldings.  So we keep track of the
 2767    "interesting" dictionaries as a VarSet in SpecEnv.
 2768    We have to take care to put any new interesting dictionary
 2769    bindings in the set.
 2770 
 2771 We accidentally lost accurate tracking of local variables for a long
 2772 time, because cloned variables don't have unfoldings. But makes a
 2773 massive difference in a few cases, eg #5113. For nofib as a
 2774 whole it's only a small win: 2.2% improvement in allocation for ansi,
 2775 1.2% for bspt, but mostly 0.0!  Average 0.1% increase in binary size.
 2776 -}
 2777 
 2778 interestingDict :: SpecEnv -> CoreExpr -> Bool
 2779 -- A dictionary argument is interesting if it has *some* structure
 2780 -- NB: "dictionary" arguments include constraints of all sorts,
 2781 --     including equality constraints; hence the Coercion case
 2782 interestingDict env (Var v) =  hasSomeUnfolding (idUnfolding v)
 2783                             || isDataConWorkId v
 2784                             || v `elemVarSet` se_interesting env
 2785 interestingDict _ (Type _)                = False
 2786 interestingDict _ (Coercion _)            = False
 2787 interestingDict env (App fn (Type _))     = interestingDict env fn
 2788 interestingDict env (App fn (Coercion _)) = interestingDict env fn
 2789 interestingDict env (Tick _ a)            = interestingDict env a
 2790 interestingDict env (Cast e _)            = interestingDict env e
 2791 interestingDict _ _                       = True
 2792 
 2793 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
 2794 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
 2795         (MkUD {ud_binds = db2, ud_calls = calls2})
 2796   = MkUD { ud_binds = db1    `unionBags`   db2
 2797          , ud_calls = calls1 `unionCalls`  calls2 }
 2798 
 2799 -----------------------------
 2800 _dictBindBndrs :: Bag DictBind -> [Id]
 2801 _dictBindBndrs dbs = foldr ((++) . bindersOf . db_bind) [] dbs
 2802 
 2803 -- | Construct a 'DictBind' from a 'CoreBind'
 2804 mkDB :: CoreBind -> DictBind
 2805 mkDB bind = DB { db_bind = bind, db_fvs = bind_fvs bind }
 2806 
 2807 -- | Identify the free variables of a 'CoreBind'
 2808 bind_fvs :: CoreBind -> VarSet
 2809 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
 2810 bind_fvs (Rec prs)         = foldl' delVarSet rhs_fvs bndrs
 2811                            where
 2812                              bndrs = map fst prs
 2813                              rhs_fvs = unionVarSets (map pair_fvs prs)
 2814 
 2815 pair_fvs :: (Id, CoreExpr) -> VarSet
 2816 pair_fvs (bndr, rhs) = exprSomeFreeVars interesting rhs
 2817                        `unionVarSet` idFreeVars bndr
 2818         -- idFreeVars: don't forget variables mentioned in
 2819         -- the rules of the bndr.  C.f. OccAnal.addRuleUsage
 2820         -- Also tyvars mentioned in its type; they may not appear
 2821         -- in the RHS
 2822         --      type T a = Int
 2823         --      x :: T a = 3
 2824   where
 2825     interesting :: InterestingVarFun
 2826     interesting v = isLocalVar v || (isId v && isDFunId v)
 2827         -- Very important: include DFunIds /even/ if it is imported
 2828         -- Reason: See Note [Avoiding loops in specImports], the #13429
 2829         --         example involving an imported dfun.  We must know
 2830         --         whether a dictionary binding depends on an imported
 2831         --         DFun in case we try to specialise that imported DFun
 2832 
 2833 -- | Flatten a set of "dumped" 'DictBind's, and some other binding
 2834 -- pairs, into a single recursive binding.
 2835 recWithDumpedDicts :: [(Id,CoreExpr)] -> Bag DictBind -> DictBind
 2836 recWithDumpedDicts pairs dbs
 2837   = DB { db_bind = Rec bindings, db_fvs = fvs }
 2838   where
 2839     (bindings, fvs) = foldr add ([], emptyVarSet)
 2840                                 (dbs `snocBag` mkDB (Rec pairs))
 2841     add (DB { db_bind = bind, db_fvs = fvs }) (prs_acc, fvs_acc)
 2842       = case bind of
 2843           NonRec b r -> ((b,r) : prs_acc, fvs')
 2844           Rec prs1   -> (prs1 ++ prs_acc, fvs')
 2845       where
 2846         fvs' = fvs_acc `unionVarSet` fvs
 2847 
 2848 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
 2849 -- Add ud_binds to the tail end of the bindings in uds
 2850 snocDictBinds uds dbs
 2851   = uds { ud_binds = ud_binds uds `unionBags` listToBag dbs }
 2852 
 2853 consDictBind :: DictBind -> UsageDetails -> UsageDetails
 2854 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
 2855 
 2856 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
 2857 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
 2858 
 2859 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
 2860 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
 2861 
 2862 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
 2863 wrapDictBinds dbs binds
 2864   = foldr add binds dbs
 2865   where
 2866     add (DB { db_bind = bind }) binds = bind : binds
 2867 
 2868 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
 2869 wrapDictBindsE dbs expr
 2870   = foldr add expr dbs
 2871   where
 2872     add (DB { db_bind = bind }) expr = Let bind expr
 2873 
 2874 ----------------------
 2875 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
 2876 -- Used at a lambda or case binder; just dump anything mentioning the binder
 2877 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
 2878   | null bndrs = (uds, emptyBag)  -- Common in case alternatives
 2879   | otherwise  = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
 2880                  (free_uds, dump_dbs)
 2881   where
 2882     free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
 2883     bndr_set = mkVarSet bndrs
 2884     (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
 2885     free_calls = deleteCallsMentioning dump_set $   -- Drop calls mentioning bndr_set on the floor
 2886                  deleteCallsFor bndrs orig_calls    -- Discard calls for bndr_set; there should be
 2887                                                     -- no calls for any of the dicts in dump_dbs
 2888 
 2889 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
 2890 -- Used at a let(rec) binding.
 2891 -- We return a boolean indicating whether the binding itself is mentioned,
 2892 -- directly or indirectly, by any of the ud_calls; in that case we want to
 2893 -- float the binding itself;
 2894 -- See Note [Floated dictionary bindings]
 2895 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
 2896   = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
 2897     (free_uds, dump_dbs, float_all)
 2898   where
 2899     free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
 2900     bndr_set = mkVarSet bndrs
 2901     (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
 2902     free_calls = deleteCallsFor bndrs orig_calls
 2903     float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
 2904 
 2905 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
 2906 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
 2907   = -- pprTrace ("callsForMe")
 2908     --          (vcat [ppr fn,
 2909     --                 text "Orig dbs ="     <+> ppr (_dictBindBndrs orig_dbs),
 2910     --                 text "Orig calls ="   <+> ppr orig_calls,
 2911     --                 text "Dep set ="      <+> ppr dep_set,
 2912     --                 text "Calls for me =" <+> ppr calls_for_me]) $
 2913     (uds_without_me, calls_for_me)
 2914   where
 2915     uds_without_me = MkUD { ud_binds = orig_dbs
 2916                           , ud_calls = delDVarEnv orig_calls fn }
 2917     calls_for_me = case lookupDVarEnv orig_calls fn of
 2918                         Nothing -> []
 2919                         Just cis -> filterCalls cis orig_dbs
 2920          -- filterCalls: drop calls that (directly or indirectly)
 2921          -- refer to fn.  See Note [Avoiding loops (DFuns)]
 2922 
 2923 ----------------------
 2924 filterCalls :: CallInfoSet -> Bag DictBind -> [CallInfo]
 2925 -- Remove dominated calls (Note [Specialising polymorphic dictionaries])
 2926 -- and loopy DFuns (Note [Avoiding loops (DFuns)])
 2927 filterCalls (CIS fn call_bag) dbs
 2928   | isDFunId fn  -- Note [Avoiding loops (DFuns)] applies only to DFuns
 2929   = filter ok_call de_dupd_calls
 2930   | otherwise         -- Do not apply it to non-DFuns
 2931   = de_dupd_calls  -- See Note [Avoiding loops (non-DFuns)]
 2932   where
 2933     de_dupd_calls = remove_dups call_bag
 2934 
 2935     dump_set = foldl' go (unitVarSet fn) dbs
 2936       -- This dump-set could also be computed by splitDictBinds
 2937       --   (_,_,dump_set) = splitDictBinds dbs {fn}
 2938       -- But this variant is shorter
 2939 
 2940     go so_far (DB { db_bind = bind, db_fvs = fvs })
 2941        | fvs `intersectsVarSet` so_far
 2942        = extendVarSetList so_far (bindersOf bind)
 2943        | otherwise = so_far
 2944 
 2945     ok_call (CI { ci_fvs = fvs }) = fvs `disjointVarSet` dump_set
 2946 
 2947 remove_dups :: Bag CallInfo -> [CallInfo]
 2948 remove_dups calls = foldr add [] calls
 2949   where
 2950     add :: CallInfo -> [CallInfo] -> [CallInfo]
 2951     add ci [] = [ci]
 2952     add ci1 (ci2:cis) | ci2 `beats_or_same` ci1 = ci2:cis
 2953                       | ci1 `beats_or_same` ci2 = ci1:cis
 2954                       | otherwise               = ci2 : add ci1 cis
 2955 
 2956 beats_or_same :: CallInfo -> CallInfo -> Bool
 2957 beats_or_same (CI { ci_key = args1 }) (CI { ci_key = args2 })
 2958   = go args1 args2
 2959   where
 2960     go [] _ = True
 2961     go (arg1:args1) (arg2:args2) = go_arg arg1 arg2 && go args1 args2
 2962     go (_:_)        []           = False
 2963 
 2964     go_arg (SpecType ty1) (SpecType ty2) = isJust (tcMatchTy ty1 ty2)
 2965     go_arg UnspecType     UnspecType     = True
 2966     go_arg (SpecDict {})  (SpecDict {})  = True
 2967     go_arg UnspecArg      UnspecArg      = True
 2968     go_arg _              _              = False
 2969 
 2970 ----------------------
 2971 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
 2972 -- splitDictBinds dbs bndrs returns
 2973 --   (free_dbs, dump_dbs, dump_set)
 2974 -- where
 2975 --   * dump_dbs depends, transitively on bndrs
 2976 --   * free_dbs does not depend on bndrs
 2977 --   * dump_set = bndrs `union` bndrs(dump_dbs)
 2978 splitDictBinds dbs bndr_set
 2979    = foldl' split_db (emptyBag, emptyBag, bndr_set) dbs
 2980                 -- Important that it's foldl' not foldr;
 2981                 -- we're accumulating the set of dumped ids in dump_set
 2982    where
 2983     split_db (free_dbs, dump_dbs, dump_idset) db
 2984         | DB { db_bind = bind, db_fvs = fvs } <- db
 2985         , dump_idset `intersectsVarSet` fvs     -- Dump it
 2986         = (free_dbs, dump_dbs `snocBag` db,
 2987            extendVarSetList dump_idset (bindersOf bind))
 2988 
 2989         | otherwise     -- Don't dump it
 2990         = (free_dbs `snocBag` db, dump_dbs, dump_idset)
 2991 
 2992 
 2993 ----------------------
 2994 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
 2995 -- Remove calls mentioning any Id in bndrs
 2996 -- NB: The call is allowed to mention TyVars in bndrs
 2997 --     Note [Specialising polymorphic dictionaries]
 2998 --     ci_fvs are just the free /Ids/
 2999 deleteCallsMentioning bndrs calls
 3000   = mapDVarEnv (ciSetFilter keep_call) calls
 3001   where
 3002     keep_call (CI { ci_fvs = fvs }) = fvs `disjointVarSet` bndrs
 3003 
 3004 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
 3005 -- Remove calls *for* bndrs
 3006 deleteCallsFor bndrs calls = delDVarEnvList calls bndrs
 3007 
 3008 {-
 3009 ************************************************************************
 3010 *                                                                      *
 3011 \subsubsection{Boring helper functions}
 3012 *                                                                      *
 3013 ************************************************************************
 3014 -}
 3015 
 3016 type SpecM a = UniqSM a
 3017 
 3018 runSpecM :: SpecM a -> CoreM a
 3019 runSpecM thing_inside
 3020   = do { us <- getUniqueSupplyM
 3021        ; return (initUs_ us thing_inside) }
 3022 
 3023 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
 3024 mapAndCombineSM _ []     = return ([], emptyUDs)
 3025 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
 3026                               (ys, uds2) <- mapAndCombineSM f xs
 3027                               return (y:ys, uds1 `plusUDs` uds2)
 3028 
 3029 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
 3030 extendTvSubstList env tv_binds
 3031   = env { se_subst = Core.extendTvSubstList (se_subst env) tv_binds }
 3032 
 3033 substTy :: SpecEnv -> Type -> Type
 3034 substTy env ty = Core.substTy (se_subst env) ty
 3035 
 3036 substCo :: SpecEnv -> Coercion -> Coercion
 3037 substCo env co = Core.substCo (se_subst env) co
 3038 
 3039 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
 3040 substBndr env bs = case Core.substBndr (se_subst env) bs of
 3041                       (subst', bs') -> (env { se_subst = subst' }, bs')
 3042 
 3043 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
 3044 substBndrs env bs = case Core.substBndrs (se_subst env) bs of
 3045                       (subst', bs') -> (env { se_subst = subst' }, bs')
 3046 
 3047 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
 3048 -- Clone the binders of the bind; return new bind with the cloned binders
 3049 -- Return the substitution to use for RHSs, and the one to use for the body
 3050 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
 3051   = do { us <- getUniqueSupplyM
 3052        ; let (subst', bndr') = Core.cloneIdBndr subst us bndr
 3053              interesting' | interestingDict env rhs
 3054                           = interesting `extendVarSet` bndr'
 3055                           | otherwise = interesting
 3056        ; return (env, env { se_subst = subst', se_interesting = interesting' }
 3057                 , NonRec bndr' rhs) }
 3058 
 3059 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
 3060   = do { us <- getUniqueSupplyM
 3061        ; let (subst', bndrs') = Core.cloneRecIdBndrs subst us (map fst pairs)
 3062              env' = env { se_subst = subst'
 3063                         , se_interesting = interesting `extendVarSetList`
 3064                                            [ v | (v,r) <- pairs, interestingDict env r ] }
 3065        ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
 3066 
 3067 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
 3068 -- Make up completely fresh binders for the dictionaries
 3069 -- Their bindings are going to float outwards
 3070 newDictBndr env b = do { uniq <- getUniqueM
 3071                         ; let n   = idName b
 3072                               ty' = substTy env (idType b)
 3073                         ; return (mkUserLocal (nameOccName n) uniq Many ty' (getSrcSpan n)) }
 3074 
 3075 newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id
 3076     -- Give the new Id a similar occurrence name to the old one
 3077 newSpecIdSM old_id new_ty join_arity_maybe
 3078   = do  { uniq <- getUniqueM
 3079         ; let name    = idName old_id
 3080               new_occ = mkSpecOcc (nameOccName name)
 3081               new_id  = mkUserLocal new_occ uniq Many new_ty (getSrcSpan name)
 3082                           `asJoinId_maybe` join_arity_maybe
 3083         ; return new_id }
 3084 
 3085 {-
 3086                 Old (but interesting) stuff about unboxed bindings
 3087                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 3088 
 3089 What should we do when a value is specialised to a *strict* unboxed value?
 3090 
 3091         map_*_* f (x:xs) = let h = f x
 3092                                t = map f xs
 3093                            in h:t
 3094 
 3095 Could convert let to case:
 3096 
 3097         map_*_Int# f (x:xs) = case f x of h# ->
 3098                               let t = map f xs
 3099                               in h#:t
 3100 
 3101 This may be undesirable since it forces evaluation here, but the value
 3102 may not be used in all branches of the body. In the general case this
 3103 transformation is impossible since the mutual recursion in a letrec
 3104 cannot be expressed as a case.
 3105 
 3106 There is also a problem with top-level unboxed values, since our
 3107 implementation cannot handle unboxed values at the top level.
 3108 
 3109 Solution: Lift the binding of the unboxed value and extract it when it
 3110 is used:
 3111 
 3112         map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
 3113                                   t = map f xs
 3114                               in case h of
 3115                                  _Lift h# -> h#:t
 3116 
 3117 Now give it to the simplifier and the _Lifting will be optimised away.
 3118 
 3119 The benefit is that we have given the specialised "unboxed" values a
 3120 very simple lifted semantics and then leave it up to the simplifier to
 3121 optimise it --- knowing that the overheads will be removed in nearly
 3122 all cases.
 3123 
 3124 In particular, the value will only be evaluated in the branches of the
 3125 program which use it, rather than being forced at the point where the
 3126 value is bound. For example:
 3127 
 3128         filtermap_*_* p f (x:xs)
 3129           = let h = f x
 3130                 t = ...
 3131             in case p x of
 3132                 True  -> h:t
 3133                 False -> t
 3134    ==>
 3135         filtermap_*_Int# p f (x:xs)
 3136           = let h = case (f x) of h# -> _Lift h#
 3137                 t = ...
 3138             in case p x of
 3139                 True  -> case h of _Lift h#
 3140                            -> h#:t
 3141                 False -> t
 3142 
 3143 The binding for h can still be inlined in the one branch and the
 3144 _Lifting eliminated.
 3145 
 3146 
 3147 Question: When won't the _Lifting be eliminated?
 3148 
 3149 Answer: When they at the top-level (where it is necessary) or when
 3150 inlining would duplicate work (or possibly code depending on
 3151 options). However, the _Lifting will still be eliminated if the
 3152 strictness analyser deems the lifted binding strict.
 3153 -}