never executed always true always false
1
2
3 {-# OPTIONS_GHC -Wno-incomplete-uni-patterns #-}
4
5 {-
6 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
7
8 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
9 -}
10
11 module GHC.Core.Opt.Specialise ( specProgram, specUnfolding ) where
12
13 import GHC.Prelude
14
15 import GHC.Driver.Session
16 import GHC.Driver.Ppr
17 import GHC.Driver.Config
18 import GHC.Driver.Config.Diagnostic
19 import GHC.Driver.Env
20
21 import GHC.Tc.Utils.TcType hiding( substTy )
22
23 import GHC.Core.Type hiding( substTy, extendTvSubstList )
24 import GHC.Core.Multiplicity
25 import GHC.Core.Predicate
26 import GHC.Core.Coercion( Coercion )
27 import GHC.Core.Opt.Monad
28 import qualified GHC.Core.Subst as Core
29 import GHC.Core.Unfold.Make
30 import GHC.Core
31 import GHC.Core.Make ( mkLitRubbish )
32 import GHC.Core.Unify ( tcMatchTy )
33 import GHC.Core.Rules
34 import GHC.Core.Utils ( exprIsTrivial, getIdFromTrivialExpr_maybe
35 , mkCast, exprType )
36 import GHC.Core.FVs
37 import GHC.Core.TyCo.Rep (TyCoBinder (..))
38 import GHC.Core.Opt.Arity ( collectBindersPushingCo
39 , etaExpandToJoinPointRule )
40
41 import GHC.Builtin.Types ( unboxedUnitTy )
42
43 import GHC.Data.Maybe ( mapMaybe, maybeToList, isJust )
44 import GHC.Data.Bag
45 import GHC.Data.FastString
46 import GHC.Data.List.SetOps
47
48 import GHC.Types.Basic
49 import GHC.Types.Unique.Supply
50 import GHC.Types.Unique.DFM
51 import GHC.Types.Name
52 import GHC.Types.Tickish
53 import GHC.Types.Id.Make ( voidArgId, voidPrimId )
54 import GHC.Types.Var ( isLocalVar )
55 import GHC.Types.Var.Set
56 import GHC.Types.Var.Env
57 import GHC.Types.Id
58 import GHC.Types.Error
59
60 import GHC.Utils.Error ( mkMCDiagnostic )
61 import GHC.Utils.Monad ( foldlM )
62 import GHC.Utils.Misc
63 import GHC.Utils.Outputable
64 import GHC.Utils.Panic
65 import GHC.Utils.Trace
66
67 import GHC.Unit.Module( Module )
68 import GHC.Unit.Module.ModGuts
69 import GHC.Unit.External
70
71 {-
72 ************************************************************************
73 * *
74 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
75 * *
76 ************************************************************************
77
78 These notes describe how we implement specialisation to eliminate
79 overloading.
80
81 The specialisation pass works on Core
82 syntax, complete with all the explicit dictionary application,
83 abstraction and construction as added by the type checker. The
84 existing type checker remains largely as it is.
85
86 One important thought: the {\em types} passed to an overloaded
87 function, and the {\em dictionaries} passed are mutually redundant.
88 If the same function is applied to the same type(s) then it is sure to
89 be applied to the same dictionary(s)---or rather to the same {\em
90 values}. (The arguments might look different but they will evaluate
91 to the same value.)
92
93 Second important thought: we know that we can make progress by
94 treating dictionary arguments as static and worth specialising on. So
95 we can do without binding-time analysis, and instead specialise on
96 dictionary arguments and no others.
97
98 The basic idea
99 ~~~~~~~~~~~~~~
100 Suppose we have
101
102 let f = <f_rhs>
103 in <body>
104
105 and suppose f is overloaded.
106
107 STEP 1: CALL-INSTANCE COLLECTION
108
109 We traverse <body>, accumulating all applications of f to types and
110 dictionaries.
111
112 (Might there be partial applications, to just some of its types and
113 dictionaries? In principle yes, but in practice the type checker only
114 builds applications of f to all its types and dictionaries, so partial
115 applications could only arise as a result of transformation, and even
116 then I think it's unlikely. In any case, we simply don't accumulate such
117 partial applications.)
118
119
120 STEP 2: EQUIVALENCES
121
122 So now we have a collection of calls to f:
123 f t1 t2 d1 d2
124 f t3 t4 d3 d4
125 ...
126 Notice that f may take several type arguments. To avoid ambiguity, we
127 say that f is called at type t1/t2 and t3/t4.
128
129 We take equivalence classes using equality of the *types* (ignoring
130 the dictionary args, which as mentioned previously are redundant).
131
132 STEP 3: SPECIALISATION
133
134 For each equivalence class, choose a representative (f t1 t2 d1 d2),
135 and create a local instance of f, defined thus:
136
137 f@t1/t2 = <f_rhs> t1 t2 d1 d2
138
139 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
140 of simplification will now result. However we don't actually *do* that
141 simplification. Rather, we leave it for the simplifier to do. If we
142 *did* do it, though, we'd get more call instances from the specialised
143 RHS. We can work out what they are by instantiating the call-instance
144 set from f's RHS with the types t1, t2.
145
146 Add this new id to f's IdInfo, to record that f has a specialised version.
147
148 Before doing any of this, check that f's IdInfo doesn't already
149 tell us about an existing instance of f at the required type/s.
150 (This might happen if specialisation was applied more than once, or
151 it might arise from user SPECIALIZE pragmas.)
152
153 Recursion
154 ~~~~~~~~~
155 Wait a minute! What if f is recursive? Then we can't just plug in
156 its right-hand side, can we?
157
158 But it's ok. The type checker *always* creates non-recursive definitions
159 for overloaded recursive functions. For example:
160
161 f x = f (x+x) -- Yes I know its silly
162
163 becomes
164
165 f a (d::Num a) = let p = +.sel a d
166 in
167 letrec fl (y::a) = fl (p y y)
168 in
169 fl
170
171 We still have recursion for non-overloaded functions which we
172 specialise, but the recursive call should get specialised to the
173 same recursive version.
174
175
176 Polymorphism 1
177 ~~~~~~~~~~~~~~
178
179 All this is crystal clear when the function is applied to *constant
180 types*; that is, types which have no type variables inside. But what if
181 it is applied to non-constant types? Suppose we find a call of f at type
182 t1/t2. There are two possibilities:
183
184 (a) The free type variables of t1, t2 are in scope at the definition point
185 of f. In this case there's no problem, we proceed just as before. A common
186 example is as follows. Here's the Haskell:
187
188 g y = let f x = x+x
189 in f y + f y
190
191 After typechecking we have
192
193 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
194 in +.sel a d (f a d y) (f a d y)
195
196 Notice that the call to f is at type type "a"; a non-constant type.
197 Both calls to f are at the same type, so we can specialise to give:
198
199 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
200 in +.sel a d (f@a y) (f@a y)
201
202
203 (b) The other case is when the type variables in the instance types
204 are *not* in scope at the definition point of f. The example we are
205 working with above is a good case. There are two instances of (+.sel a d),
206 but "a" is not in scope at the definition of +.sel. Can we do anything?
207 Yes, we can "common them up", a sort of limited common sub-expression deal.
208 This would give:
209
210 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
211 f@a (x::a) = +.sel@a x x
212 in +.sel@a (f@a y) (f@a y)
213
214 This can save work, and can't be spotted by the type checker, because
215 the two instances of +.sel weren't originally at the same type.
216
217 Further notes on (b)
218
219 * There are quite a few variations here. For example, the defn of
220 +.sel could be floated outside the \y, to attempt to gain laziness.
221 It certainly mustn't be floated outside the \d because the d has to
222 be in scope too.
223
224 * We don't want to inline f_rhs in this case, because
225 that will duplicate code. Just commoning up the call is the point.
226
227 * Nothing gets added to +.sel's IdInfo.
228
229 * Don't bother unless the equivalence class has more than one item!
230
231 Not clear whether this is all worth it. It is of course OK to
232 simply discard call-instances when passing a big lambda.
233
234 Polymorphism 2 -- Overloading
235 ~~~~~~~~~~~~~~
236 Consider a function whose most general type is
237
238 f :: forall a b. Ord a => [a] -> b -> b
239
240 There is really no point in making a version of g at Int/Int and another
241 at Int/Bool, because it's only instantiating the type variable "a" which
242 buys us any efficiency. Since g is completely polymorphic in b there
243 ain't much point in making separate versions of g for the different
244 b types.
245
246 That suggests that we should identify which of g's type variables
247 are constrained (like "a") and which are unconstrained (like "b").
248 Then when taking equivalence classes in STEP 2, we ignore the type args
249 corresponding to unconstrained type variable. In STEP 3 we make
250 polymorphic versions. Thus:
251
252 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
253
254 We do this.
255
256
257 Dictionary floating
258 ~~~~~~~~~~~~~~~~~~~
259 Consider this
260
261 f a (d::Num a) = let g = ...
262 in
263 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
264
265 Here, g is only called at one type, but the dictionary isn't in scope at the
266 definition point for g. Usually the type checker would build a
267 definition for d1 which enclosed g, but the transformation system
268 might have moved d1's defn inward. Solution: float dictionary bindings
269 outwards along with call instances.
270
271 Consider
272
273 f x = let g p q = p==q
274 h r s = (r+s, g r s)
275 in
276 h x x
277
278
279 Before specialisation, leaving out type abstractions we have
280
281 f df x = let g :: Eq a => a -> a -> Bool
282 g dg p q = == dg p q
283 h :: Num a => a -> a -> (a, Bool)
284 h dh r s = let deq = eqFromNum dh
285 in (+ dh r s, g deq r s)
286 in
287 h df x x
288
289 After specialising h we get a specialised version of h, like this:
290
291 h' r s = let deq = eqFromNum df
292 in (+ df r s, g deq r s)
293
294 But we can't naively make an instance for g from this, because deq is not in scope
295 at the defn of g. Instead, we have to float out the (new) defn of deq
296 to widen its scope. Notice that this floating can't be done in advance -- it only
297 shows up when specialisation is done.
298
299 User SPECIALIZE pragmas
300 ~~~~~~~~~~~~~~~~~~~~~~~
301 Specialisation pragmas can be digested by the type checker, and implemented
302 by adding extra definitions along with that of f, in the same way as before
303
304 f@t1/t2 = <f_rhs> t1 t2 d1 d2
305
306 Indeed the pragmas *have* to be dealt with by the type checker, because
307 only it knows how to build the dictionaries d1 and d2! For example
308
309 g :: Ord a => [a] -> [a]
310 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
311
312 Here, the specialised version of g is an application of g's rhs to the
313 Ord dictionary for (Tree Int), which only the type checker can conjure
314 up. There might not even *be* one, if (Tree Int) is not an instance of
315 Ord! (All the other specialision has suitable dictionaries to hand
316 from actual calls.)
317
318 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
319 it is buried in a complex (as-yet-un-desugared) binding group.
320 Maybe we should say
321
322 f@t1/t2 = f* t1 t2 d1 d2
323
324 where f* is the Id f with an IdInfo which says "inline me regardless!".
325 Indeed all the specialisation could be done in this way.
326 That in turn means that the simplifier has to be prepared to inline absolutely
327 any in-scope let-bound thing.
328
329
330 Again, the pragma should permit polymorphism in unconstrained variables:
331
332 h :: Ord a => [a] -> b -> b
333 {-# SPECIALIZE h :: [Int] -> b -> b #-}
334
335 We *insist* that all overloaded type variables are specialised to ground types,
336 (and hence there can be no context inside a SPECIALIZE pragma).
337 We *permit* unconstrained type variables to be specialised to
338 - a ground type
339 - or left as a polymorphic type variable
340 but nothing in between. So
341
342 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
343
344 is *illegal*. (It can be handled, but it adds complication, and gains the
345 programmer nothing.)
346
347
348 SPECIALISING INSTANCE DECLARATIONS
349 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
350 Consider
351
352 instance Foo a => Foo [a] where
353 ...
354 {-# SPECIALIZE instance Foo [Int] #-}
355
356 The original instance decl creates a dictionary-function
357 definition:
358
359 dfun.Foo.List :: forall a. Foo a -> Foo [a]
360
361 The SPECIALIZE pragma just makes a specialised copy, just as for
362 ordinary function definitions:
363
364 dfun.Foo.List@Int :: Foo [Int]
365 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
366
367 The information about what instance of the dfun exist gets added to
368 the dfun's IdInfo in the same way as a user-defined function too.
369
370
371 Automatic instance decl specialisation?
372 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
373 Can instance decls be specialised automatically? It's tricky.
374 We could collect call-instance information for each dfun, but
375 then when we specialised their bodies we'd get new call-instances
376 for ordinary functions; and when we specialised their bodies, we might get
377 new call-instances of the dfuns, and so on. This all arises because of
378 the unrestricted mutual recursion between instance decls and value decls.
379
380 Still, there's no actual problem; it just means that we may not do all
381 the specialisation we could theoretically do.
382
383 Furthermore, instance decls are usually exported and used non-locally,
384 so we'll want to compile enough to get those specialisations done.
385
386 Lastly, there's no such thing as a local instance decl, so we can
387 survive solely by spitting out *usage* information, and then reading that
388 back in as a pragma when next compiling the file. So for now,
389 we only specialise instance decls in response to pragmas.
390
391
392 SPITTING OUT USAGE INFORMATION
393 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
394
395 To spit out usage information we need to traverse the code collecting
396 call-instance information for all imported (non-prelude?) functions
397 and data types. Then we equivalence-class it and spit it out.
398
399 This is done at the top-level when all the call instances which escape
400 must be for imported functions and data types.
401
402 *** Not currently done ***
403
404
405 Partial specialisation by pragmas
406 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
407 What about partial specialisation:
408
409 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
410 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
411
412 or even
413
414 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
415
416 Seems quite reasonable. Similar things could be done with instance decls:
417
418 instance (Foo a, Foo b) => Foo (a,b) where
419 ...
420 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
421 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
422
423 Ho hum. Things are complex enough without this. I pass.
424
425
426 Requirements for the simplifier
427 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
428 The simplifier has to be able to take advantage of the specialisation.
429
430 * When the simplifier finds an application of a polymorphic f, it looks in
431 f's IdInfo in case there is a suitable instance to call instead. This converts
432
433 f t1 t2 d1 d2 ===> f_t1_t2
434
435 Note that the dictionaries get eaten up too!
436
437 * Dictionary selection operations on constant dictionaries must be
438 short-circuited:
439
440 +.sel Int d ===> +Int
441
442 The obvious way to do this is in the same way as other specialised
443 calls: +.sel has inside it some IdInfo which tells that if it's applied
444 to the type Int then it should eat a dictionary and transform to +Int.
445
446 In short, dictionary selectors need IdInfo inside them for constant
447 methods.
448
449 * Exactly the same applies if a superclass dictionary is being
450 extracted:
451
452 Eq.sel Int d ===> dEqInt
453
454 * Something similar applies to dictionary construction too. Suppose
455 dfun.Eq.List is the function taking a dictionary for (Eq a) to
456 one for (Eq [a]). Then we want
457
458 dfun.Eq.List Int d ===> dEq.List_Int
459
460 Where does the Eq [Int] dictionary come from? It is built in
461 response to a SPECIALIZE pragma on the Eq [a] instance decl.
462
463 In short, dfun Ids need IdInfo with a specialisation for each
464 constant instance of their instance declaration.
465
466 All this uses a single mechanism: the SpecEnv inside an Id
467
468
469 What does the specialisation IdInfo look like?
470 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
471
472 The SpecEnv of an Id maps a list of types (the template) to an expression
473
474 [Type] |-> Expr
475
476 For example, if f has this RuleInfo:
477
478 [Int, a] -> \d:Ord Int. f' a
479
480 it means that we can replace the call
481
482 f Int t ===> (\d. f' t)
483
484 This chucks one dictionary away and proceeds with the
485 specialised version of f, namely f'.
486
487
488 What can't be done this way?
489 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
490 There is no way, post-typechecker, to get a dictionary for (say)
491 Eq a from a dictionary for Eq [a]. So if we find
492
493 ==.sel [t] d
494
495 we can't transform to
496
497 eqList (==.sel t d')
498
499 where
500 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
501
502 Of course, we currently have no way to automatically derive
503 eqList, nor to connect it to the Eq [a] instance decl, but you
504 can imagine that it might somehow be possible. Taking advantage
505 of this is permanently ruled out.
506
507 Still, this is no great hardship, because we intend to eliminate
508 overloading altogether anyway!
509
510 A note about non-tyvar dictionaries
511 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
512 Some Ids have types like
513
514 forall a,b,c. Eq a -> Ord [a] -> tau
515
516 This seems curious at first, because we usually only have dictionary
517 args whose types are of the form (C a) where a is a type variable.
518 But this doesn't hold for the functions arising from instance decls,
519 which sometimes get arguments with types of form (C (T a)) for some
520 type constructor T.
521
522 Should we specialise wrt this compound-type dictionary? We used to say
523 "no", saying:
524 "This is a heuristic judgement, as indeed is the fact that we
525 specialise wrt only dictionaries. We choose *not* to specialise
526 wrt compound dictionaries because at the moment the only place
527 they show up is in instance decls, where they are simply plugged
528 into a returned dictionary. So nothing is gained by specialising
529 wrt them."
530
531 But it is simpler and more uniform to specialise wrt these dicts too;
532 and in future GHC is likely to support full fledged type signatures
533 like
534 f :: Eq [(a,b)] => ...
535
536
537 ************************************************************************
538 * *
539 \subsubsection{The new specialiser}
540 * *
541 ************************************************************************
542
543 Our basic game plan is this. For let(rec) bound function
544 f :: (C a, D c) => (a,b,c,d) -> Bool
545
546 * Find any specialised calls of f, (f ts ds), where
547 ts are the type arguments t1 .. t4, and
548 ds are the dictionary arguments d1 .. d2.
549
550 * Add a new definition for f1 (say):
551
552 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
553
554 Note that we abstract over the unconstrained type arguments.
555
556 * Add the mapping
557
558 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
559
560 to the specialisations of f. This will be used by the
561 simplifier to replace calls
562 (f t1 t2 t3 t4) da db
563 by
564 (\d1 d1 -> f1 t2 t4) da db
565
566 All the stuff about how many dictionaries to discard, and what types
567 to apply the specialised function to, are handled by the fact that the
568 SpecEnv contains a template for the result of the specialisation.
569
570 We don't build *partial* specialisations for f. For example:
571
572 f :: Eq a => a -> a -> Bool
573 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
574
575 Here, little is gained by making a specialised copy of f.
576 There's a distinct danger that the specialised version would
577 first build a dictionary for (Eq b, Eq c), and then select the (==)
578 method from it! Even if it didn't, not a great deal is saved.
579
580 We do, however, generate polymorphic, but not overloaded, specialisations:
581
582 f :: Eq a => [a] -> b -> b -> b
583 ... SPECIALISE f :: [Int] -> b -> b -> b ...
584
585 Hence, the invariant is this:
586
587 *** no specialised version is overloaded ***
588
589
590 ************************************************************************
591 * *
592 \subsubsection{The exported function}
593 * *
594 ************************************************************************
595 -}
596
597 -- | Specialise calls to type-class overloaded functions occurring in a program.
598 specProgram :: ModGuts -> CoreM ModGuts
599 specProgram guts@(ModGuts { mg_module = this_mod
600 , mg_rules = local_rules
601 , mg_binds = binds })
602 = do { dflags <- getDynFlags
603
604 -- We need to start with a Subst that knows all the things
605 -- that are in scope, so that the substitution engine doesn't
606 -- accidentally re-use a unique that's already in use
607 -- Easiest thing is to do it all at once, as if all the top-level
608 -- decls were mutually recursive
609 ; let top_env = SE { se_subst = Core.mkEmptySubst $ mkInScopeSet $ mkVarSet $
610 bindersOfBinds binds
611 , se_interesting = emptyVarSet
612 , se_module = this_mod
613 , se_dflags = dflags }
614
615 go [] = return ([], emptyUDs)
616 go (bind:binds) = do (binds', uds) <- go binds
617 (bind', uds') <- specBind top_env bind uds
618 return (bind' ++ binds', uds')
619
620 -- Specialise the bindings of this module
621 ; (binds', uds) <- runSpecM (go binds)
622
623 ; (spec_rules, spec_binds) <- specImports top_env local_rules uds
624
625 ; return (guts { mg_binds = spec_binds ++ binds'
626 , mg_rules = spec_rules ++ local_rules }) }
627
628 {-
629 Note [Wrap bindings returned by specImports]
630 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
631 'specImports' returns a set of specialized bindings. However, these are lacking
632 necessary floated dictionary bindings, which are returned by
633 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
634 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
635 for instance, the 'specImports' call in 'specProgram'.
636
637
638 Note [Disabling cross-module specialisation]
639 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
640 Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
641 in modules outside of the current module. This can sometimes uncover user code
642 which explodes in size when aggressively optimized. The
643 -fno-cross-module-specialise option was introduced to allow users to being
644 bitten by such instances to revert to the pre-7.10 behavior.
645
646 See #10491
647 -}
648
649
650 {- *********************************************************************
651 * *
652 Specialising imported functions
653 * *
654 ********************************************************************* -}
655
656 specImports :: SpecEnv
657 -> [CoreRule]
658 -> UsageDetails
659 -> CoreM ([CoreRule], [CoreBind])
660 specImports top_env local_rules
661 (MkUD { ud_binds = dict_binds, ud_calls = calls })
662 | not $ gopt Opt_CrossModuleSpecialise (se_dflags top_env)
663 -- See Note [Disabling cross-module specialisation]
664 = return ([], wrapDictBinds dict_binds [])
665
666 | otherwise
667 = do { hpt_rules <- getRuleBase
668 ; let rule_base = extendRuleBaseList hpt_rules local_rules
669
670 ; (spec_rules, spec_binds) <- spec_imports top_env [] rule_base
671 dict_binds calls
672
673 -- Don't forget to wrap the specialized bindings with
674 -- bindings for the needed dictionaries.
675 -- See Note [Wrap bindings returned by specImports]
676 -- and Note [Glom the bindings if imported functions are specialised]
677 ; let final_binds
678 | null spec_binds = wrapDictBinds dict_binds []
679 | otherwise = [Rec $ flattenBinds $
680 wrapDictBinds dict_binds spec_binds]
681
682 ; return (spec_rules, final_binds)
683 }
684
685 -- | Specialise a set of calls to imported bindings
686 spec_imports :: SpecEnv -- Passed in so that all top-level Ids are in scope
687 -> [Id] -- Stack of imported functions being specialised
688 -- See Note [specImport call stack]
689 -> RuleBase -- Rules from this module and the home package
690 -- (but not external packages, which can change)
691 -> Bag DictBind -- Dict bindings, used /only/ for filterCalls
692 -- See Note [Avoiding loops in specImports]
693 -> CallDetails -- Calls for imported things
694 -> CoreM ( [CoreRule] -- New rules
695 , [CoreBind] ) -- Specialised bindings
696 spec_imports top_env callers rule_base dict_binds calls
697 = do { let import_calls = dVarEnvElts calls
698 -- ; debugTraceMsg (text "specImports {" <+>
699 -- vcat [ text "calls:" <+> ppr import_calls
700 -- , text "dict_binds:" <+> ppr dict_binds ])
701 ; (rules, spec_binds) <- go rule_base import_calls
702 -- ; debugTraceMsg (text "End specImports }" <+> ppr import_calls)
703
704 ; return (rules, spec_binds) }
705 where
706 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
707 go _ [] = return ([], [])
708 go rb (cis : other_calls)
709 = do { -- debugTraceMsg (text "specImport {" <+> ppr cis)
710 ; (rules1, spec_binds1) <- spec_import top_env callers rb dict_binds cis
711 -- ; debugTraceMsg (text "specImport }" <+> ppr cis)
712
713 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
714 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
715
716 spec_import :: SpecEnv -- Passed in so that all top-level Ids are in scope
717 -> [Id] -- Stack of imported functions being specialised
718 -- See Note [specImport call stack]
719 -> RuleBase -- Rules from this module
720 -> Bag DictBind -- Dict bindings, used /only/ for filterCalls
721 -- See Note [Avoiding loops in specImports]
722 -> CallInfoSet -- Imported function and calls for it
723 -> CoreM ( [CoreRule] -- New rules
724 , [CoreBind] ) -- Specialised bindings
725 spec_import top_env callers rb dict_binds cis@(CIS fn _)
726 | isIn "specImport" fn callers
727 = return ([], []) -- No warning. This actually happens all the time
728 -- when specialising a recursive function, because
729 -- the RHS of the specialised function contains a recursive
730 -- call to the original function
731
732 | null good_calls
733 = return ([], [])
734
735 | Just rhs <- canSpecImport dflags fn
736 = do { -- Get rules from the external package state
737 -- We keep doing this in case we "page-fault in"
738 -- more rules as we go along
739 ; hsc_env <- getHscEnv
740 ; eps <- liftIO $ hscEPS hsc_env
741 ; vis_orphs <- getVisibleOrphanMods
742 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
743 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
744
745 ; (rules1, spec_pairs, MkUD { ud_binds = dict_binds1, ud_calls = new_calls })
746 <- -- debugTraceMsg (text "specImport1" <+> vcat [ppr fn, ppr good_calls, ppr rhs]) >>
747 (runSpecM $ specCalls True top_env rules_for_fn good_calls fn rhs)
748 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
749 -- After the rules kick in we may get recursion, but
750 -- we rely on a global GlomBinds to sort that out later
751 -- See Note [Glom the bindings if imported functions are specialised]
752
753 -- Now specialise any cascaded calls
754 -- ; debugTraceMsg (text "specImport 2" <+> (ppr fn $$ ppr rules1 $$ ppr spec_binds1))
755 ; (rules2, spec_binds2) <- spec_imports top_env
756 (fn:callers)
757 (extendRuleBaseList rb rules1)
758 (dict_binds `unionBags` dict_binds1)
759 new_calls
760
761 ; let final_binds = wrapDictBinds dict_binds1 $
762 spec_binds2 ++ spec_binds1
763
764 ; return (rules2 ++ rules1, final_binds) }
765
766 | otherwise
767 = do { tryWarnMissingSpecs dflags callers fn good_calls
768 ; return ([], [])}
769
770 where
771 dflags = se_dflags top_env
772 good_calls = filterCalls cis dict_binds
773 -- SUPER IMPORTANT! Drop calls that (directly or indirectly) refer to fn
774 -- See Note [Avoiding loops in specImports]
775
776 canSpecImport :: DynFlags -> Id -> Maybe CoreExpr
777 -- See Note [Specialise imported INLINABLE things]
778 canSpecImport dflags fn
779 | isDataConWrapId fn
780 = Nothing -- Don't specialise data-con wrappers, even if they
781 -- have dict args; there is no benefit.
782
783 | CoreUnfolding { uf_src = src, uf_tmpl = rhs } <- unf
784 , isStableSource src
785 = Just rhs -- By default, specialise only imported things that have a stable
786 -- unfolding; that is, have an INLINE or INLINABLE pragma
787 -- Specialise even INLINE things; it hasn't inlined yet,
788 -- so perhaps it never will. Moreover it may have calls
789 -- inside it that we want to specialise
790
791 -- CoreUnfolding case does /not/ include DFunUnfoldings;
792 -- We only specialise DFunUnfoldings with -fspecialise-aggressively
793 -- See Note [Do not specialise imported DFuns]
794
795 | gopt Opt_SpecialiseAggressively dflags
796 = maybeUnfoldingTemplate unf -- With -fspecialise-aggressively, specialise anything
797 -- with an unfolding, stable or not, DFun or not
798
799 | otherwise = Nothing
800 where
801 unf = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
802
803 -- | Returns whether or not to show a missed-spec warning.
804 -- If -Wall-missed-specializations is on, show the warning.
805 -- Otherwise, if -Wmissed-specializations is on, only show a warning
806 -- if there is at least one imported function being specialized,
807 -- and if all imported functions are marked with an inline pragma
808 -- Use the most specific warning as the reason.
809 tryWarnMissingSpecs :: DynFlags -> [Id] -> Id -> [CallInfo] -> CoreM ()
810 -- See Note [Warning about missed specialisations]
811 tryWarnMissingSpecs dflags callers fn calls_for_fn
812 | isClassOpId fn = return () -- See Note [Missed specialization for ClassOps]
813 | wopt Opt_WarnMissedSpecs dflags
814 && not (null callers)
815 && allCallersInlined = doWarn $ WarningWithFlag Opt_WarnMissedSpecs
816 | wopt Opt_WarnAllMissedSpecs dflags = doWarn $ WarningWithFlag Opt_WarnAllMissedSpecs
817 | otherwise = return ()
818 where
819 allCallersInlined = all (isAnyInlinePragma . idInlinePragma) callers
820 diag_opts = initDiagOpts dflags
821 doWarn reason =
822 msg (mkMCDiagnostic diag_opts reason)
823 (vcat [ hang (text ("Could not specialise imported function") <+> quotes (ppr fn))
824 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
825 | caller <- callers])
826 , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
827 , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
828
829 {- Note [Missed specialisation for ClassOps]
830 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
831 In #19592 I saw a number of missed specialisation warnings
832 which were the result of things like:
833
834 case isJumpishInstr @X86.Instr $dInstruction_s7f8 eta3_a78C of { ...
835
836 where isJumpishInstr is part of the Instruction class and defined like
837 this:
838
839 class Instruction instr where
840 ...
841 isJumpishInstr :: instr -> Bool
842 ...
843
844 isJumpishInstr is a ClassOp which will select the right method
845 from within the dictionary via our built in rules. See also
846 Note [ClassOp/DFun selection] in GHC.Tc.TyCl.Instance.
847
848 We don't give these unfoldings, and as a result the specialiser
849 complains. But usually this doesn't matter. The simplifier will
850 apply the rule and we end up with
851
852 case isJumpishInstrImplX86 eta3_a78C of { ...
853
854 Since isJumpishInstrImplX86 is defined for a concrete instance (given
855 by the dictionary) it is usually already well specialised!
856 Theoretically the implementation of a method could still be overloaded
857 over a different type class than what it's a method of. But I wasn't able
858 to make this go wrong, and SPJ thinks this should be fine as well.
859
860 So I decided to remove the warnings for failed specialisations on ClassOps
861 alltogether as they do more harm than good.
862 -}
863
864 {- Note [Do not specialise imported DFuns]
865 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
866 Ticket #18223 shows that specialising calls of DFuns is can cause a huge
867 and entirely unnecessary blowup in program size. Consider a call to
868 f @[[[[[[[[T]]]]]]]] d1 x
869 where df :: C a => C [a]
870 d1 :: C [[[[[[[[T]]]]]]]] = dfC[] @[[[[[[[T]]]]]]] d1
871 d2 :: C [[[[[[[T]]]]]]] = dfC[] @[[[[[[T]]]]]] d3
872 ...
873 Now we'll specialise f's RHS, which may give rise to calls to 'g',
874 also overloaded, which we will specialise, and so on. However, if
875 we specialise the calls to dfC[], we'll generate specialised copies of
876 all methods of C, at all types; and the same for C's superclasses.
877
878 And many of these specialised functions will never be called. We are
879 going to call the specialised 'f', and the specialised 'g', but DFuns
880 group functions into a tuple, many of whose elements may never be used.
881
882 With deeply-nested types this can lead to a simply overwhelming number
883 of specialisations: see #18223 for a simple example (from the wild).
884 I measured the number of specialisations for various numbers of calls
885 of `flip evalStateT ()`, and got this
886
887 Size after one simplification
888 #calls #SPEC rules Terms Types
889 5 56 3100 10600
890 9 108 13660 77206
891
892 The real tests case has 60+ calls, which blew GHC out of the water.
893
894 Solution: don't specialise DFuns. The downside is that if we end
895 up with (h (dfun d)), /and/ we don't specialise 'h', then we won't
896 pass to 'h' a tuple of specialised functions.
897
898 However, the flag -fspecialise-aggressively (experimental, off by default)
899 allows DFuns to specialise as well.
900
901 Note [Avoiding loops in specImports]
902 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
903 We must take great care when specialising instance declarations
904 (DFuns like $fOrdList) lest we accidentally build a recursive
905 dictionary. See Note [Avoiding loops (DFuns)].
906
907 The basic strategy of Note [Avoiding loops (DFuns)] is to use filterCalls
908 to discard loopy specialisations. But to do that we must ensure
909 that the in-scope dict-binds (passed to filterCalls) contains
910 all the needed dictionary bindings. In particular, in the recursive
911 call to spec_imports in spec_import, we must include the dict-binds
912 from the parent. Lacking this caused #17151, a really nasty bug.
913
914 Here is what happened.
915 * Class structure:
916 Source is a superclass of Mut
917 Index is a superclass of Source
918
919 * We started with these dict binds
920 dSource = $fSourcePix @Int $fIndexInt
921 dIndex = sc_sel dSource
922 dMut = $fMutPix @Int dIndex
923 and these calls to specialise
924 $fMutPix @Int dIndex
925 $fSourcePix @Int $fIndexInt
926
927 * We specialised the call ($fMutPix @Int dIndex)
928 ==> new call ($fSourcePix @Int dIndex)
929 (because Source is a superclass of Mut)
930
931 * We specialised ($fSourcePix @Int dIndex)
932 ==> produces specialised dict $s$fSourcePix,
933 a record with dIndex as a field
934 plus RULE forall d. ($fSourcePix @Int d) = $s$fSourcePix
935 *** This is the bogus step ***
936
937 * Now we decide not to specialise the call
938 $fSourcePix @Int $fIndexInt
939 because we alredy have a RULE that matches it
940
941 * Finally the simplifer rewrites
942 dSource = $fSourcePix @Int $fIndexInt
943 ==> dSource = $s$fSourcePix
944
945 Disaster. Now we have
946
947 Rewrite dSource's RHS to $s$fSourcePix Disaster
948 dSource = $s$fSourcePix
949 dIndex = sc_sel dSource
950 $s$fSourcePix = MkSource dIndex ...
951
952 Solution: filterCalls should have stopped the bogus step,
953 by seeing that dIndex transitively uses $fSourcePix. But
954 it can only do that if it sees all the dict_binds. Wow.
955
956 --------------
957 Here's another example (#13429). Suppose we have
958 class Monoid v => C v a where ...
959
960 We start with a call
961 f @ [Integer] @ Integer $fC[]Integer
962
963 Specialising call to 'f' gives dict bindings
964 $dMonoid_1 :: Monoid [Integer]
965 $dMonoid_1 = M.$p1C @ [Integer] $fC[]Integer
966
967 $dC_1 :: C [Integer] (Node [Integer] Integer)
968 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
969
970 ...plus a recursive call to
971 f @ [Integer] @ (Node [Integer] Integer) $dC_1
972
973 Specialising that call gives
974 $dMonoid_2 :: Monoid [Integer]
975 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
976
977 $dC_2 :: C [Integer] (Node [Integer] Integer)
978 $dC_2 = M.$fCvNode @ [Integer] $dMonoid_2
979
980 Now we have two calls to the imported function
981 M.$fCvNode :: Monoid v => C v a
982 M.$fCvNode @v @a m = C m some_fun
983
984 But we must /not/ use the call (M.$fCvNode @ [Integer] $dMonoid_2)
985 for specialisation, else we get:
986
987 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
988 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
989 $s$fCvNode = C $dMonoid_2 ...
990 RULE M.$fCvNode [Integer] _ _ = $s$fCvNode
991
992 Now use the rule to rewrite the call in the RHS of $dC_1
993 and we get a loop!
994
995
996 Note [specImport call stack]
997 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
998 When specialising an imports function 'f', we may get new calls
999 of an imported function 'g', which we want to specialise in turn,
1000 and similarly specialising 'g' might expose a new call to 'h'.
1001
1002 We track the stack of enclosing functions. So when specialising 'h' we
1003 haev a specImport call stack of [g,f]. We do this for two reasons:
1004 * Note [Warning about missed specialisations]
1005 * Note [Avoiding recursive specialisation]
1006
1007 Note [Warning about missed specialisations]
1008 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1009 Suppose
1010 * In module Lib, you carefully mark a function 'foo' INLINABLE
1011 * Import Lib(foo) into another module M
1012 * Call 'foo' at some specialised type in M
1013 Then you jolly well expect it to be specialised in M. But what if
1014 'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be
1015 specialised too. But if 'bar' is not marked INLINABLE it may well
1016 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
1017
1018 It's more noisy to warning about a missed specialisation opportunity
1019 for /every/ overloaded imported function, but sometimes useful. That
1020 is what Opt_WarnAllMissedSpecs does.
1021
1022 ToDo: warn about missed opportunities for local functions.
1023
1024 Note [Avoiding recursive specialisation]
1025 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1026 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
1027 'f's RHS. So we want to specialise g,h. But we don't want to
1028 specialise f any more! It's possible that f's RHS might have a
1029 recursive yet-more-specialised call, so we'd diverge in that case.
1030 And if the call is to the same type, one specialisation is enough.
1031 Avoiding this recursive specialisation loop is one reason for the
1032 'callers' stack passed to specImports and specImport.
1033
1034 Note [Specialise imported INLINABLE things]
1035 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1036 What imported functions do we specialise? The basic set is
1037 * DFuns and things with INLINABLE pragmas.
1038 but with -fspecialise-aggressively we add
1039 * Anything with an unfolding template
1040
1041 #8874 has a good example of why we want to auto-specialise DFuns.
1042
1043 We have the -fspecialise-aggressively flag (usually off), because we
1044 risk lots of orphan modules from over-vigorous specialisation.
1045 However it's not a big deal: anything non-recursive with an
1046 unfolding-template will probably have been inlined already.
1047
1048 Note [Glom the bindings if imported functions are specialised]
1049 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1050 Suppose we have an imported, *recursive*, INLINABLE function
1051 f :: Eq a => a -> a
1052 f = /\a \d x. ...(f a d)...
1053 In the module being compiled we have
1054 g x = f (x::Int)
1055 Now we'll make a specialised function
1056 f_spec :: Int -> Int
1057 f_spec = \x -> ...(f Int dInt)...
1058 {-# RULE f Int _ = f_spec #-}
1059 g = \x. f Int dInt x
1060 Note that f_spec doesn't look recursive
1061 After rewriting with the RULE, we get
1062 f_spec = \x -> ...(f_spec)...
1063 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
1064 The occurrence analyser never turns a NonRec into a Rec. So we must
1065 make sure that f_spec is recursive. Easiest thing is to make all
1066 the specialisations for imported bindings recursive.
1067
1068
1069
1070 ************************************************************************
1071 * *
1072 \subsubsection{@specExpr@: the main function}
1073 * *
1074 ************************************************************************
1075 -}
1076
1077 data SpecEnv
1078 = SE { se_subst :: Core.Subst
1079 -- We carry a substitution down:
1080 -- a) we must clone any binding that might float outwards,
1081 -- to avoid name clashes
1082 -- b) we carry a type substitution to use when analysing
1083 -- the RHS of specialised bindings (no type-let!)
1084
1085
1086 , se_interesting :: VarSet
1087 -- Dict Ids that we know something about
1088 -- and hence may be worth specialising against
1089 -- See Note [Interesting dictionary arguments]
1090
1091 , se_module :: Module
1092 , se_dflags :: DynFlags
1093 }
1094
1095 instance Outputable SpecEnv where
1096 ppr (SE { se_subst = subst, se_interesting = interesting })
1097 = text "SE" <+> braces (sep $ punctuate comma
1098 [ text "subst =" <+> ppr subst
1099 , text "interesting =" <+> ppr interesting ])
1100
1101 specVar :: SpecEnv -> Id -> CoreExpr
1102 specVar env v = Core.lookupIdSubst (se_subst env) v
1103
1104 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
1105
1106 ---------------- First the easy cases --------------------
1107 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
1108 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
1109 specExpr env (Var v) = return (specVar env v, emptyUDs)
1110 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
1111 specExpr env (Cast e co)
1112 = do { (e', uds) <- specExpr env e
1113 ; return ((mkCast e' (substCo env co)), uds) }
1114 specExpr env (Tick tickish body)
1115 = do { (body', uds) <- specExpr env body
1116 ; return (Tick (specTickish env tickish) body', uds) }
1117
1118 ---------------- Applications might generate a call instance --------------------
1119 specExpr env expr@(App {})
1120 = go expr []
1121 where
1122 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
1123 (fun', uds_app) <- go fun (arg':args)
1124 return (App fun' arg', uds_arg `plusUDs` uds_app)
1125
1126 go (Var f) args = case specVar env f of
1127 Var f' -> return (Var f', mkCallUDs env f' args)
1128 e' -> return (e', emptyUDs) -- I don't expect this!
1129 go other _ = specExpr env other
1130
1131 ---------------- Lambda/case require dumping of usage details --------------------
1132 specExpr env e@(Lam {})
1133 = specLam env' bndrs' body
1134 where
1135 (bndrs, body) = collectBinders e
1136 (env', bndrs') = substBndrs env bndrs
1137 -- More efficient to collect a group of binders together all at once
1138 -- and we don't want to split a lambda group with dumped bindings
1139
1140 specExpr env (Case scrut case_bndr ty alts)
1141 = do { (scrut', scrut_uds) <- specExpr env scrut
1142 ; (scrut'', case_bndr', alts', alts_uds)
1143 <- specCase env scrut' case_bndr alts
1144 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
1145 , scrut_uds `plusUDs` alts_uds) }
1146
1147 ---------------- Finally, let is the interesting case --------------------
1148 specExpr env (Let bind body)
1149 = do { -- Clone binders
1150 (rhs_env, body_env, bind') <- cloneBindSM env bind
1151
1152 -- Deal with the body
1153 ; (body', body_uds) <- specExpr body_env body
1154
1155 -- Deal with the bindings
1156 ; (binds', uds) <- specBind rhs_env bind' body_uds
1157
1158 -- All done
1159 ; return (foldr Let body' binds', uds) }
1160
1161 --------------
1162 specLam :: SpecEnv -> [OutBndr] -> InExpr -> SpecM (OutExpr, UsageDetails)
1163 -- The binders have been substituted, but the body has not
1164 specLam env bndrs body
1165 | null bndrs
1166 = specExpr env body
1167 | otherwise
1168 = do { (body', uds) <- specExpr env body
1169 ; let (free_uds, dumped_dbs) = dumpUDs bndrs uds
1170 ; return (mkLams bndrs (wrapDictBindsE dumped_dbs body'), free_uds) }
1171
1172 --------------
1173 specTickish :: SpecEnv -> CoreTickish -> CoreTickish
1174 specTickish env (Breakpoint ext ix ids)
1175 = Breakpoint ext ix [ id' | id <- ids, Var id' <- [specVar env id]]
1176 -- drop vars from the list if they have a non-variable substitution.
1177 -- should never happen, but it's harmless to drop them anyway.
1178 specTickish _ other_tickish = other_tickish
1179
1180 --------------
1181 specCase :: SpecEnv
1182 -> CoreExpr -- Scrutinee, already done
1183 -> Id -> [CoreAlt]
1184 -> SpecM ( CoreExpr -- New scrutinee
1185 , Id
1186 , [CoreAlt]
1187 , UsageDetails)
1188 specCase env scrut' case_bndr [Alt con args rhs]
1189 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
1190 , interestingDict env scrut'
1191 , not (isDeadBinder case_bndr && null sc_args')
1192 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
1193
1194 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
1195 [Alt con args' (Var sc_arg')]
1196 | sc_arg' <- sc_args' ]
1197
1198 -- Extend the substitution for RHS to map the *original* binders
1199 -- to their floated versions.
1200 mb_sc_flts :: [Maybe DictId]
1201 mb_sc_flts = map (lookupVarEnv clone_env) args'
1202 clone_env = zipVarEnv sc_args' sc_args_flt
1203 subst_prs = (case_bndr, Var case_bndr_flt)
1204 : [ (arg, Var sc_flt)
1205 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
1206 env_rhs' = env_rhs { se_subst = Core.extendIdSubstList (se_subst env_rhs) subst_prs
1207 , se_interesting = se_interesting env_rhs `extendVarSetList`
1208 (case_bndr_flt : sc_args_flt) }
1209
1210 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
1211 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
1212 case_bndr_set = unitVarSet case_bndr_flt
1213 sc_binds = [ DB { db_bind = NonRec sc_arg_flt sc_rhs
1214 , db_fvs = case_bndr_set }
1215 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
1216 flt_binds = scrut_bind : sc_binds
1217 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
1218 all_uds = flt_binds `addDictBinds` free_uds
1219 alt' = Alt con args' (wrapDictBindsE dumped_dbs rhs')
1220 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
1221 where
1222 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
1223 sc_args' = filter is_flt_sc_arg args'
1224
1225 clone_me bndr = do { uniq <- getUniqueM
1226 ; return (mkUserLocalOrCoVar occ uniq wght ty loc) }
1227 where
1228 name = idName bndr
1229 wght = idMult bndr
1230 ty = idType bndr
1231 occ = nameOccName name
1232 loc = getSrcSpan name
1233
1234 arg_set = mkVarSet args'
1235 is_flt_sc_arg var = isId var
1236 && not (isDeadBinder var)
1237 && isDictTy var_ty
1238 && tyCoVarsOfType var_ty `disjointVarSet` arg_set
1239 where
1240 var_ty = idType var
1241
1242
1243 specCase env scrut case_bndr alts
1244 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
1245 ; return (scrut, case_bndr', alts', uds_alts) }
1246 where
1247 (env_alt, case_bndr') = substBndr env case_bndr
1248 spec_alt (Alt con args rhs) = do
1249 (rhs', uds) <- specExpr env_rhs rhs
1250 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
1251 return (Alt con args' (wrapDictBindsE dumped_dbs rhs'), free_uds)
1252 where
1253 (env_rhs, args') = substBndrs env_alt args
1254
1255 {-
1256 Note [Floating dictionaries out of cases]
1257 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1258 Consider
1259 g = \d. case d of { MkD sc ... -> ...(f sc)... }
1260 Naively we can't float d2's binding out of the case expression,
1261 because 'sc' is bound by the case, and that in turn means we can't
1262 specialise f, which seems a pity.
1263
1264 So we invert the case, by floating out a binding
1265 for 'sc_flt' thus:
1266 sc_flt = case d of { MkD sc ... -> sc }
1267 Now we can float the call instance for 'f'. Indeed this is just
1268 what'll happen if 'sc' was originally bound with a let binding,
1269 but case is more efficient, and necessary with equalities. So it's
1270 good to work with both.
1271
1272 You might think that this won't make any difference, because the
1273 call instance will only get nuked by the \d. BUT if 'g' itself is
1274 specialised, then transitively we should be able to specialise f.
1275
1276 In general, given
1277 case e of cb { MkD sc ... -> ...(f sc)... }
1278 we transform to
1279 let cb_flt = e
1280 sc_flt = case cb_flt of { MkD sc ... -> sc }
1281 in
1282 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1283
1284 The "_flt" things are the floated binds; we use the current substitution
1285 to substitute sc -> sc_flt in the RHS
1286
1287 ************************************************************************
1288 * *
1289 Dealing with a binding
1290 * *
1291 ************************************************************************
1292 -}
1293
1294 specBind :: SpecEnv -- Use this for RHSs
1295 -> CoreBind -- Binders are already cloned by cloneBindSM,
1296 -- but RHSs are un-processed
1297 -> UsageDetails -- Info on how the scope of the binding
1298 -> SpecM ([CoreBind], -- New bindings
1299 UsageDetails) -- And info to pass upstream
1300
1301 -- Returned UsageDetails:
1302 -- No calls for binders of this bind
1303 specBind rhs_env (NonRec fn rhs) body_uds
1304 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1305
1306 ; let zapped_fn = zapIdDemandInfo fn
1307 -- We zap the demand info because the binding may float,
1308 -- which would invaidate the demand info (see #17810 for example).
1309 -- Destroying demand info is not terrible; specialisation is
1310 -- always followed soon by demand analysis.
1311 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds zapped_fn rhs
1312
1313 ; let pairs = spec_defns ++ [(fn', rhs')]
1314 -- fn' mentions the spec_defns in its rules,
1315 -- so put the latter first
1316
1317 combined_uds = body_uds1 `plusUDs` rhs_uds
1318
1319 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1320
1321 final_binds :: [DictBind]
1322 -- See Note [From non-recursive to recursive]
1323 final_binds
1324 | not (isEmptyBag dump_dbs)
1325 , not (null spec_defns)
1326 = [recWithDumpedDicts pairs dump_dbs]
1327 | otherwise
1328 = [mkDB $ NonRec b r | (b,r) <- pairs]
1329 ++ bagToList dump_dbs
1330
1331 ; if float_all then
1332 -- Rather than discard the calls mentioning the bound variables
1333 -- we float this (dictionary) binding along with the others
1334 return ([], free_uds `snocDictBinds` final_binds)
1335 else
1336 -- No call in final_uds mentions bound variables,
1337 -- so we can just leave the binding here
1338 return (map db_bind final_binds, free_uds) }
1339
1340
1341 specBind rhs_env (Rec pairs) body_uds
1342 -- Note [Specialising a recursive group]
1343 = do { let (bndrs,rhss) = unzip pairs
1344 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1345 ; let scope_uds = body_uds `plusUDs` rhs_uds
1346 -- Includes binds and calls arising from rhss
1347
1348 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1349
1350 ; (bndrs3, spec_defns3, uds3)
1351 <- if null spec_defns1 -- Common case: no specialisation
1352 then return (bndrs1, [], uds1)
1353 else do { -- Specialisation occurred; do it again
1354 (bndrs2, spec_defns2, uds2)
1355 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1356 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1357
1358 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1359 final_bind = recWithDumpedDicts (spec_defns3 ++ zip bndrs3 rhss')
1360 dumped_dbs
1361
1362 ; if float_all then
1363 return ([], final_uds `snocDictBind` final_bind)
1364 else
1365 return ([db_bind final_bind], final_uds) }
1366
1367
1368 ---------------------------
1369 specDefns :: SpecEnv
1370 -> UsageDetails -- Info on how it is used in its scope
1371 -> [(OutId,InExpr)] -- The things being bound and their un-processed RHS
1372 -> SpecM ([OutId], -- Original Ids with RULES added
1373 [(OutId,OutExpr)], -- Extra, specialised bindings
1374 UsageDetails) -- Stuff to fling upwards from the specialised versions
1375
1376 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1377 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1378 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1379 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1380 -- But not vice versa (it's a fixpoint problem).
1381
1382 specDefns _env uds []
1383 = return ([], [], uds)
1384 specDefns env uds ((bndr,rhs):pairs)
1385 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1386 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1387 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1388
1389 ---------------------------
1390 specDefn :: SpecEnv
1391 -> UsageDetails -- Info on how it is used in its scope
1392 -> OutId -> InExpr -- The thing being bound and its un-processed RHS
1393 -> SpecM (Id, -- Original Id with added RULES
1394 [(Id,CoreExpr)], -- Extra, specialised bindings
1395 UsageDetails) -- Stuff to fling upwards from the specialised versions
1396
1397 specDefn env body_uds fn rhs
1398 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1399 rules_for_me = idCoreRules fn
1400 ; (rules, spec_defns, spec_uds) <- specCalls False env rules_for_me
1401 calls_for_me fn rhs
1402 ; return ( fn `addIdSpecialisations` rules
1403 , spec_defns
1404 , body_uds_without_me `plusUDs` spec_uds) }
1405 -- It's important that the `plusUDs` is this way
1406 -- round, because body_uds_without_me may bind
1407 -- dictionaries that are used in calls_for_me passed
1408 -- to specDefn. So the dictionary bindings in
1409 -- spec_uds may mention dictionaries bound in
1410 -- body_uds_without_me
1411
1412 ---------------------------
1413 specCalls :: Bool -- True => specialising imported fn
1414 -- False => specialising local fn
1415 -> SpecEnv
1416 -> [CoreRule] -- Existing RULES for the fn
1417 -> [CallInfo]
1418 -> OutId -> InExpr
1419 -> SpecM SpecInfo -- New rules, specialised bindings, and usage details
1420
1421 -- This function checks existing rules, and does not create
1422 -- duplicate ones. So the caller does not need to do this filtering.
1423 -- See 'already_covered'
1424
1425 type SpecInfo = ( [CoreRule] -- Specialisation rules
1426 , [(Id,CoreExpr)] -- Specialised definition
1427 , UsageDetails ) -- Usage details from specialised RHSs
1428
1429 specCalls spec_imp env existing_rules calls_for_me fn rhs
1430 -- The first case is the interesting one
1431 | notNull calls_for_me -- And there are some calls to specialise
1432 && not (isNeverActive (idInlineActivation fn))
1433 -- Don't specialise NOINLINE things
1434 -- See Note [Auto-specialisation and RULES]
1435
1436 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1437 -- See Note [Inline specialisation] for why we do not
1438 -- switch off specialisation for inline functions
1439
1440 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr existing_rules) $
1441 foldlM spec_call ([], [], emptyUDs) calls_for_me
1442
1443 | otherwise -- No calls or RHS doesn't fit our preconceptions
1444 = warnPprTrace (not (exprIsTrivial rhs) && notNull calls_for_me)
1445 (text "Missed specialisation opportunity for" <+> ppr fn $$ _trace_doc) $
1446 -- Note [Specialisation shape]
1447 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1448 return ([], [], emptyUDs)
1449 where
1450 _trace_doc = sep [ ppr rhs_bndrs, ppr (idInlineActivation fn) ]
1451
1452 fn_type = idType fn
1453 fn_arity = idArity fn
1454 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1455 inl_prag = idInlinePragma fn
1456 inl_act = inlinePragmaActivation inl_prag
1457 is_local = isLocalId fn
1458 is_dfun = isDFunId fn
1459 dflags = se_dflags env
1460 ropts = initRuleOpts dflags
1461 this_mod = se_module env
1462 -- Figure out whether the function has an INLINE pragma
1463 -- See Note [Inline specialisations]
1464
1465 (rhs_bndrs, rhs_body) = collectBindersPushingCo rhs
1466 -- See Note [Account for casts in binding]
1467
1468 in_scope = Core.substInScope (se_subst env)
1469
1470 already_covered :: RuleOpts -> [CoreRule] -> [CoreExpr] -> Bool
1471 already_covered ropts new_rules args -- Note [Specialisations already covered]
1472 = isJust (lookupRule ropts (in_scope, realIdUnfolding)
1473 (const True) fn args
1474 (new_rules ++ existing_rules))
1475 -- NB: we look both in the new_rules (generated by this invocation
1476 -- of specCalls), and in existing_rules (passed in to specCalls)
1477
1478 ----------------------------------------------------------
1479 -- Specialise to one particular call pattern
1480 spec_call :: SpecInfo -- Accumulating parameter
1481 -> CallInfo -- Call instance
1482 -> SpecM SpecInfo
1483 spec_call spec_acc@(rules_acc, pairs_acc, uds_acc) _ci@(CI { ci_key = call_args })
1484 = -- See Note [Specialising Calls]
1485 do { let all_call_args | is_dfun = call_args ++ repeat UnspecArg
1486 | otherwise = call_args
1487 -- See Note [Specialising DFuns]
1488 ; ( useful, rhs_env2, leftover_bndrs
1489 , rule_bndrs, rule_lhs_args
1490 , spec_bndrs1, dx_binds, spec_args) <- specHeader env rhs_bndrs all_call_args
1491
1492 -- ; pprTrace "spec_call" (vcat [ text "call info: " <+> ppr _ci
1493 -- , text "useful: " <+> ppr useful
1494 -- , text "rule_bndrs:" <+> ppr rule_bndrs
1495 -- , text "lhs_args: " <+> ppr rule_lhs_args
1496 -- , text "spec_bndrs:" <+> ppr spec_bndrs1
1497 -- , text "spec_args: " <+> ppr spec_args
1498 -- , text "dx_binds: " <+> ppr dx_binds
1499 -- , text "rhs_env2: " <+> ppr (se_subst rhs_env2)
1500 -- , ppr dx_binds ]) $
1501 -- return ()
1502
1503 ; if not useful -- No useful specialisation
1504 || already_covered ropts rules_acc rule_lhs_args
1505 then return spec_acc
1506 else
1507 do { -- Run the specialiser on the specialised RHS
1508 -- The "1" suffix is before we maybe add the void arg
1509 ; (rhs_body', rhs_uds) <- specExpr rhs_env2 rhs_body
1510 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1511 -- to the rhs_uds; see Note [Specialising Calls]
1512 ; let rhs_uds_w_dx = foldr consDictBind rhs_uds dx_binds
1513 spec_rhs_bndrs = spec_bndrs1 ++ leftover_bndrs
1514 (spec_uds, dumped_dbs) = dumpUDs spec_rhs_bndrs rhs_uds_w_dx
1515 spec_rhs1 = mkLams spec_rhs_bndrs $
1516 wrapDictBindsE dumped_dbs rhs_body'
1517
1518 spec_fn_ty1 = exprType spec_rhs1
1519
1520 -- Maybe add a void arg to the specialised function,
1521 -- to avoid unlifted bindings
1522 -- See Note [Specialisations Must Be Lifted]
1523 -- C.f. GHC.Core.Opt.WorkWrap.Utils.mkWorkerArgs
1524 add_void_arg = isUnliftedType spec_fn_ty1 && not (isJoinId fn)
1525 (spec_bndrs, spec_rhs, spec_fn_ty)
1526 | add_void_arg = ( voidPrimId : spec_bndrs1
1527 , Lam voidArgId spec_rhs1
1528 , mkVisFunTyMany unboxedUnitTy spec_fn_ty1)
1529 | otherwise = (spec_bndrs1, spec_rhs1, spec_fn_ty1)
1530
1531 join_arity_decr = length rule_lhs_args - length spec_bndrs
1532 spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn
1533 = Just (orig_join_arity - join_arity_decr)
1534 | otherwise
1535 = Nothing
1536
1537 ; spec_fn <- newSpecIdSM fn spec_fn_ty spec_join_arity
1538 ; let
1539 -- The rule to put in the function's specialisation is:
1540 -- forall x @b d1' d2'.
1541 -- f x @T1 @b @T2 d1' d2' = f1 x @b
1542 -- See Note [Specialising Calls]
1543 herald | spec_imp = -- Specialising imported fn
1544 text "SPEC/" <> ppr this_mod
1545 | otherwise = -- Specialising local fn
1546 text "SPEC"
1547
1548 rule_name = mkFastString $ showSDoc dflags $
1549 herald <+> ftext (occNameFS (getOccName fn))
1550 <+> hsep (mapMaybe ppr_call_key_ty call_args)
1551 -- This name ends up in interface files, so use occNameString.
1552 -- Otherwise uniques end up there, making builds
1553 -- less deterministic (See #4012 comment:61 ff)
1554
1555 rule_wout_eta = mkRule
1556 this_mod
1557 True {- Auto generated -}
1558 is_local
1559 rule_name
1560 inl_act -- Note [Auto-specialisation and RULES]
1561 (idName fn)
1562 rule_bndrs
1563 rule_lhs_args
1564 (mkVarApps (Var spec_fn) spec_bndrs)
1565
1566 spec_rule
1567 = case isJoinId_maybe fn of
1568 Just join_arity -> etaExpandToJoinPointRule join_arity rule_wout_eta
1569 Nothing -> rule_wout_eta
1570
1571 simpl_opts = initSimpleOpts dflags
1572
1573 --------------------------------------
1574 -- Add a suitable unfolding if the spec_inl_prag says so
1575 -- See Note [Inline specialisations]
1576 (spec_inl_prag, spec_unf)
1577 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1578 = (neverInlinePragma, noUnfolding)
1579 -- See Note [Specialising imported functions] in "GHC.Core.Opt.OccurAnal"
1580
1581 | isInlinablePragma inl_prag
1582 = (inl_prag { inl_inline = NoUserInlinePrag }, noUnfolding)
1583
1584 | otherwise
1585 = (inl_prag, specUnfolding simpl_opts spec_bndrs spec_unf_body
1586 rule_lhs_args fn_unf)
1587
1588 spec_unf_body body = wrapDictBindsE dumped_dbs $
1589 body `mkApps` spec_args
1590
1591 --------------------------------------
1592 -- Adding arity information just propagates it a bit faster
1593 -- See Note [Arity decrease] in GHC.Core.Opt.Simplify
1594 -- Copy InlinePragma information from the parent Id.
1595 -- So if f has INLINE[1] so does spec_fn
1596 arity_decr = count isValArg rule_lhs_args - count isId spec_bndrs
1597 spec_f_w_arity = spec_fn `setIdArity` max 0 (fn_arity - arity_decr)
1598 `setInlinePragma` spec_inl_prag
1599 `setIdUnfolding` spec_unf
1600 `asJoinId_maybe` spec_join_arity
1601
1602 _rule_trace_doc = vcat [ ppr fn <+> dcolon <+> ppr fn_type
1603 , ppr spec_fn <+> dcolon <+> ppr spec_fn_ty
1604 , ppr rhs_bndrs, ppr call_args
1605 , ppr spec_rule
1606 ]
1607
1608 ; -- pprTrace "spec_call: rule" _rule_trace_doc
1609 return ( spec_rule : rules_acc
1610 , (spec_f_w_arity, spec_rhs) : pairs_acc
1611 , spec_uds `plusUDs` uds_acc
1612 ) } }
1613
1614 {- Note [Specialising DFuns]
1615 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1616 DFuns have a special sort of unfolding (DFunUnfolding), and these are
1617 hard to specialise a DFunUnfolding to give another DFunUnfolding
1618 unless the DFun is fully applied (#18120). So, in the case of DFunIds
1619 we simply extend the CallKey with trailing UnspecArgs, so we'll
1620 generate a rule that completely saturates the DFun.
1621
1622 There is an ASSERT that checks this, in the DFunUnfolding case of
1623 GHC.Core.Unfold.specUnfolding.
1624
1625 Note [Specialisation Must Preserve Sharing]
1626 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1627 Consider a function:
1628
1629 f :: forall a. Eq a => a -> blah
1630 f =
1631 if expensive
1632 then f1
1633 else f2
1634
1635 As written, all calls to 'f' will share 'expensive'. But if we specialise 'f'
1636 at 'Int', eg:
1637
1638 $sfInt = SUBST[a->Int,dict->dEqInt] (if expensive then f1 else f2)
1639
1640 RULE "SPEC f"
1641 forall (d :: Eq Int).
1642 f Int _ = $sfIntf
1643
1644 We've now lost sharing between 'f' and '$sfInt' for 'expensive'. Yikes!
1645
1646 To avoid this, we only generate specialisations for functions whose arity is
1647 enough to bind all of the arguments we need to specialise. This ensures our
1648 specialised functions don't do any work before receiving all of their dicts,
1649 and thus avoids the 'f' case above.
1650
1651 Note [Specialisations Must Be Lifted]
1652 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1653 Consider a function 'f':
1654
1655 f = forall a. Eq a => Array# a
1656
1657 used like
1658
1659 case x of
1660 True -> ...f @Int dEqInt...
1661 False -> 0
1662
1663 Naively, we might generate an (expensive) specialisation
1664
1665 $sfInt :: Array# Int
1666
1667 even in the case that @x = False@! Instead, we add a dummy 'Void#' argument to
1668 the specialisation '$sfInt' ($sfInt :: Void# -> Array# Int) in order to
1669 preserve laziness.
1670
1671 Note [Specialising Calls]
1672 ~~~~~~~~~~~~~~~~~~~~~~~~~
1673 Suppose we have a function with a complicated type:
1674
1675 f :: forall a b c. Int -> Eq a => Show b => c -> Blah
1676 f @a @b @c i dEqA dShowA x = blah
1677
1678 and suppose it is called at:
1679
1680 f 7 @T1 @T2 @T3 dEqT1 ($dfShow dShowT2) t3
1681
1682 This call is described as a 'CallInfo' whose 'ci_key' is:
1683
1684 [ SpecType T1, SpecType T2, UnspecType, UnspecArg, SpecDict dEqT1
1685 , SpecDict ($dfShow dShowT2), UnspecArg ]
1686
1687 Why are 'a' and 'b' identified as 'SpecType', while 'c' is 'UnspecType'?
1688 Because we must specialise the function on type variables that appear
1689 free in its *dictionary* arguments; but not on type variables that do not
1690 appear in any dictionaries, i.e. are fully polymorphic.
1691
1692 Because this call has dictionaries applied, we'd like to specialise
1693 the call on any type argument that appears free in those dictionaries.
1694 In this case, those are [a :-> T1, b :-> T2].
1695
1696 We also need to substitute the dictionary binders with their
1697 specialised dictionaries. The simplest substitution would be
1698 [dEqA :-> dEqT1, dShowA :-> $dfShow dShowT2], but this duplicates
1699 work, since `$dfShow dShowT2` is a function application. Therefore, we
1700 also want to *float the dictionary out* (via bindAuxiliaryDict),
1701 creating a new dict binding
1702
1703 dShow1 = $dfShow dShowT2
1704
1705 and the substitution [dEqA :-> dEqT1, dShowA :-> dShow1].
1706
1707 With the substitutions in hand, we can generate a specialised function:
1708
1709 $sf :: forall c. Int -> c -> Blah
1710 $sf = SUBST[a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1] (\@c i x -> blah)
1711
1712 Note that the substitution is applied to the whole thing. This is
1713 convenient, but just slightly fragile. Notably:
1714 * There had better be no name clashes in a/b/c
1715
1716 We must construct a rewrite rule:
1717
1718 RULE "SPEC f @T1 @T2 _"
1719 forall (@c :: Type) (i :: Int) (d1 :: Eq T1) (d2 :: Show T2).
1720 f @T1 @T2 @c i d1 d2 = $sf @c i
1721
1722 In the rule, d1 and d2 are just wildcards, not used in the RHS. Note
1723 additionally that 'x' isn't captured by this rule --- we bind only
1724 enough etas in order to capture all of the *specialised* arguments.
1725
1726 Note [Drop dead args from specialisations]
1727 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1728 When specialising a function, it’s possible some of the arguments may
1729 actually be dead. For example, consider:
1730
1731 f :: forall a. () -> Show a => a -> String
1732 f x y = show y ++ "!"
1733
1734 We might generate the following CallInfo for `f @Int`:
1735
1736 [SpecType Int, UnspecArg, SpecDict $dShowInt, UnspecArg]
1737
1738 Normally we’d include both the x and y arguments in the
1739 specialisation, since we’re not specialising on either of them. But
1740 that’s silly, since x is actually unused! So we might as well drop it
1741 in the specialisation:
1742
1743 $sf :: Int -> String
1744 $sf y = show y ++ "!"
1745
1746 {-# RULE "SPEC f @Int" forall x. f @Int x $dShow = $sf #-}
1747
1748 This doesn’t save us much, since the arg would be removed later by
1749 worker/wrapper, anyway, but it’s easy to do.
1750
1751 Wrinkles
1752
1753 * Note that we only drop dead arguments if:
1754 1. We don’t specialise on them.
1755 2. They come before an argument we do specialise on.
1756 Doing the latter would require eta-expanding the RULE, which could
1757 make it match less often, so it’s not worth it. Doing the former could
1758 be more useful --- it would stop us from generating pointless
1759 specialisations --- but it’s more involved to implement and unclear if
1760 it actually provides much benefit in practice.
1761
1762 * If the function has a stable unfolding, specHeader has to come up with
1763 arguments to pass to that stable unfolding, when building the stable
1764 unfolding of the specialised function: this is the last field in specHeader's
1765 big result tuple.
1766
1767 The right thing to do is to produce a LitRubbish; it should rapidly
1768 disappear. Rather like GHC.Core.Opt.WorkWrap.Utils.mk_absent_let.
1769
1770 Note [Zap occ info in rule binders]
1771 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1772 When we generate a specialisation RULE, we need to drop occurrence
1773 info on the binders. If we don’t, things go wrong when we specialise a
1774 function like
1775
1776 f :: forall a. () -> Show a => a -> String
1777 f x y = show y ++ "!"
1778
1779 since we’ll generate a RULE like
1780
1781 RULE "SPEC f @Int" forall x [Occ=Dead].
1782 f @Int x $dShow = $sf
1783
1784 and Core Lint complains, even though x only appears on the LHS (due to
1785 Note [Drop dead args from specialisations]).
1786
1787 Why is that a Lint error? Because the arguments on the LHS of a rule
1788 are syntactically expressions, not patterns, so Lint treats the
1789 appearance of x as a use rather than a binding. Fortunately, the
1790 solution is simple: we just make sure to zap the occ info before
1791 using ids as wildcard binders in a rule.
1792
1793 Note [Account for casts in binding]
1794 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1795 Consider
1796 f :: Eq a => a -> IO ()
1797 {-# INLINABLE f
1798 StableUnf = (/\a \(d:Eq a) (x:a). blah) |> g
1799 #-}
1800 f = ...
1801
1802 In f's stable unfolding we have done some modest simplification which
1803 has pushed the cast to the outside. (I wonder if this is the Right
1804 Thing, but it's what happens now; see GHC.Core.Opt.Simplify.Utils Note [Casts and
1805 lambdas].) Now that stable unfolding must be specialised, so we want
1806 to push the cast back inside. It would be terrible if the cast
1807 defeated specialisation! Hence the use of collectBindersPushingCo.
1808
1809 Note [Evidence foralls]
1810 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1811 Suppose (#12212) that we are specialising
1812 f :: forall a b. (Num a, F a ~ F b) => blah
1813 with a=b=Int. Then the RULE will be something like
1814 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1815 f Int Int d g = f_spec
1816 But both varToCoreExpr (when constructing the LHS args), and the
1817 simplifier (when simplifying the LHS args), will transform to
1818 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1819 f Int Int d <F Int> = f_spec
1820 by replacing g with Refl. So now 'g' is unbound, which results in a later
1821 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1822 * varToCoreExpr generates a Refl
1823 * exprsFreeIdsList returns the Ids bound by the args,
1824 which won't include g
1825
1826 You might wonder if this will match as often, but the simplifier replaces
1827 complicated Refl coercions with Refl pretty aggressively.
1828
1829 Note [Orphans and auto-generated rules]
1830 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1831 When we specialise an INLINABLE function, or when we have
1832 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1833 We don't want to warn about these, or we'd generate a lot of warnings.
1834 Thus, we only warn about user-specified orphan rules.
1835
1836 Indeed, we don't even treat the module as an orphan module if it has
1837 auto-generated *rule* orphans. Orphan modules are read every time we
1838 compile, so they are pretty obtrusive and slow down every compilation,
1839 even non-optimised ones. (Reason: for type class instances it's a
1840 type correctness issue.) But specialisation rules are strictly for
1841 *optimisation* only so it's fine not to read the interface.
1842
1843 What this means is that a SPEC rules from auto-specialisation in
1844 module M will be used in other modules only if M.hi has been read for
1845 some other reason, which is actually pretty likely.
1846
1847 Note [From non-recursive to recursive]
1848 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1849 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1850 have built a recursive knot
1851
1852 f a d x = <blah>
1853 MkUD { ud_binds = NonRec d7 (MkD ..f..)
1854 , ud_calls = ...(f T d7)... }
1855
1856 The we generate
1857
1858 Rec { fs x = <blah>[T/a, d7/d]
1859 f a d x = <blah>
1860 RULE f T _ = fs
1861 d7 = ...f... }
1862
1863 Here the recursion is only through the RULE.
1864
1865 However we definitely should /not/ make the Rec in this wildly common
1866 case:
1867 d = ...
1868 MkUD { ud_binds = NonRec d7 (...d...)
1869 , ud_calls = ...(f T d7)... }
1870
1871 Here we want simply to add d to the floats, giving
1872 MkUD { ud_binds = NonRec d (...)
1873 NonRec d7 (...d...)
1874 , ud_calls = ...(f T d7)... }
1875
1876 In general, we need only make this Rec if
1877 - there are some specialisations (spec_binds non-empty)
1878 - there are some dict_binds that depend on f (dump_dbs non-empty)
1879
1880 Note [Avoiding loops (DFuns)]
1881 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1882 When specialising /dictionary functions/ we must be very careful to
1883 avoid building loops. Here is an example that bit us badly, on
1884 several distinct occasions.
1885
1886 Here is one: #3591
1887 class Eq a => C a
1888 instance Eq [a] => C [a]
1889
1890 This translates to
1891 dfun :: Eq [a] -> C [a]
1892 dfun a d = MkD a d (meth d)
1893
1894 d4 :: Eq [T] = <blah>
1895 d2 :: C [T] = dfun T d4
1896 d1 :: Eq [T] = $p1 d2
1897 d3 :: C [T] = dfun T d1
1898
1899 None of these definitions is recursive. What happened was that we
1900 generated a specialisation:
1901 RULE forall d. dfun T d = dT :: C [T]
1902 dT = (MkD a d (meth d)) [T/a, d1/d]
1903 = MkD T d1 (meth d1)
1904
1905 But now we use the RULE on the RHS of d2, to get
1906 d2 = dT = MkD d1 (meth d1)
1907 d1 = $p1 d2
1908
1909 and now d1 is bottom! The problem is that when specialising 'dfun' we
1910 should first dump "below" the binding all floated dictionary bindings
1911 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1912 placed below 'dfun', and thus unavailable to it when specialising
1913 'dfun'. That in turn means that the call (dfun T d1) must be
1914 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1915 d4 doesn't mention dfun.
1916
1917 Solution:
1918 Discard all calls that mention dictionaries that depend
1919 (directly or indirectly) on the dfun we are specialising.
1920 This is done by 'filterCalls'
1921
1922 Note [Avoiding loops (non-DFuns)]
1923 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1924 The whole Note [Avoiding loops (DFuns)] things applies only to DFuns.
1925 It's important /not/ to apply filterCalls to non-DFuns. For example:
1926
1927 class C a where { foo,bar :: [a] -> [a] }
1928
1929 instance C Int where
1930 foo x = r_bar x
1931 bar xs = reverse xs
1932
1933 r_bar :: C a => [a] -> [a]
1934 r_bar xs = bar (xs ++ xs)
1935
1936 That translates to:
1937
1938 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1939
1940 Rec { $fCInt :: C Int = MkC foo_help reverse
1941 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1942
1943 The call (r_bar $fCInt) mentions $fCInt,
1944 which mentions foo_help,
1945 which mentions r_bar
1946
1947 But we DO want to specialise r_bar at Int:
1948 Rec { $fCInt :: C Int = MkC foo_help reverse
1949 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1950
1951 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1952 RULE r_bar Int _ = r_bar_Int
1953
1954 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1955 }
1956
1957 Note that, because of its RULE, r_bar joins the recursive
1958 group. (In this case it'll unravel a short moment later.)
1959 See test simplCore/should_compile/T19599a.
1960
1961 Another example is #19599, which looked like this:
1962
1963 class (Show a, Enum a) => MyShow a where
1964 myShow :: a -> String
1965
1966 myShow_impl :: MyShow a => a -> String
1967
1968 foo :: Int -> String
1969 foo = myShow_impl @Int $fMyShowInt
1970
1971 Rec { $fMyShowInt = MkMyShowD $fEnumInt $fShowInt $cmyShow
1972 ; $cmyShow = myShow_impl @Int $fMyShowInt }
1973
1974 Here, we really do want to specialise `myShow_impl @Int $fMyShowInt`.
1975
1976
1977 Note [Specialising a recursive group]
1978 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1979 Consider
1980 let rec { f x = ...g x'...
1981 ; g y = ...f y'.... }
1982 in f 'a'
1983 Here we specialise 'f' at Char; but that is very likely to lead to
1984 a specialisation of 'g' at Char. We must do the latter, else the
1985 whole point of specialisation is lost.
1986
1987 But we do not want to keep iterating to a fixpoint, because in the
1988 presence of polymorphic recursion we might generate an infinite number
1989 of specialisations.
1990
1991 So we use the following heuristic:
1992 * Arrange the rec block in dependency order, so far as possible
1993 (the occurrence analyser already does this)
1994
1995 * Specialise it much like a sequence of lets
1996
1997 * Then go through the block a second time, feeding call-info from
1998 the RHSs back in the bottom, as it were
1999
2000 In effect, the ordering maxmimises the effectiveness of each sweep,
2001 and we do just two sweeps. This should catch almost every case of
2002 monomorphic recursion -- the exception could be a very knotted-up
2003 recursion with multiple cycles tied up together.
2004
2005 This plan is implemented in the Rec case of specBindItself.
2006
2007 Note [Specialisations already covered]
2008 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2009 We obviously don't want to generate two specialisations for the same
2010 argument pattern. There are two wrinkles
2011
2012 1. We do the already-covered test in specDefn, not when we generate
2013 the CallInfo in mkCallUDs. We used to test in the latter place, but
2014 we now iterate the specialiser somewhat, and the Id at the call site
2015 might therefore not have all the RULES that we can see in specDefn
2016
2017 2. What about two specialisations where the second is an *instance*
2018 of the first? If the more specific one shows up first, we'll generate
2019 specialisations for both. If the *less* specific one shows up first,
2020 we *don't* currently generate a specialisation for the more specific
2021 one. (See the call to lookupRule in already_covered.) Reasons:
2022 (a) lookupRule doesn't say which matches are exact (bad reason)
2023 (b) if the earlier specialisation is user-provided, it's
2024 far from clear that we should auto-specialise further
2025
2026 Note [Auto-specialisation and RULES]
2027 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2028 Consider:
2029 g :: Num a => a -> a
2030 g = ...
2031
2032 f :: (Int -> Int) -> Int
2033 f w = ...
2034 {-# RULE f g = 0 #-}
2035
2036 Suppose that auto-specialisation makes a specialised version of
2037 g::Int->Int That version won't appear in the LHS of the RULE for f.
2038 So if the specialisation rule fires too early, the rule for f may
2039 never fire.
2040
2041 It might be possible to add new rules, to "complete" the rewrite system.
2042 Thus when adding
2043 RULE forall d. g Int d = g_spec
2044 also add
2045 RULE f g_spec = 0
2046
2047 But that's a bit complicated. For now we ask the programmer's help,
2048 by *copying the INLINE activation pragma* to the auto-specialised
2049 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
2050 will also not be active until phase 2. And that's what programmers
2051 should jolly well do anyway, even aside from specialisation, to ensure
2052 that g doesn't inline too early.
2053
2054 This in turn means that the RULE would never fire for a NOINLINE
2055 thing so not much point in generating a specialisation at all.
2056
2057 Note [Specialisation shape]
2058 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
2059 We only specialise a function if it has visible top-level lambdas
2060 corresponding to its overloading. E.g. if
2061 f :: forall a. Eq a => ....
2062 then its body must look like
2063 f = /\a. \d. ...
2064
2065 Reason: when specialising the body for a call (f ty dexp), we want to
2066 substitute dexp for d, and pick up specialised calls in the body of f.
2067
2068 This doesn't always work. One example I came across was this:
2069 newtype Gen a = MkGen{ unGen :: Int -> a }
2070
2071 choose :: Eq a => a -> Gen a
2072 choose n = MkGen (\r -> n)
2073
2074 oneof = choose (1::Int)
2075
2076 It's a silly example, but we get
2077 choose = /\a. g `cast` co
2078 where choose doesn't have any dict arguments. Thus far I have not
2079 tried to fix this (wait till there's a real example).
2080
2081 Mind you, then 'choose' will be inlined (since RHS is trivial) so
2082 it doesn't matter. This comes up with single-method classes
2083
2084 class C a where { op :: a -> a }
2085 instance C a => C [a] where ....
2086 ==>
2087 $fCList :: C a => C [a]
2088 $fCList = $copList |> (...coercion>...)
2089 ....(uses of $fCList at particular types)...
2090
2091 So we suppress the WARN if the rhs is trivial.
2092
2093 Note [Inline specialisations]
2094 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2095 Here is what we do with the InlinePragma of the original function
2096 * Activation/RuleMatchInfo: both transferred to the
2097 specialised function
2098 * InlineSpec:
2099 (a) An INLINE pragma is transferred
2100 (b) An INLINABLE pragma is *not* transferred
2101
2102 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
2103 specialise the function at its call site, and arguably that's not so
2104 important for the specialised copies. BUT *pragma-directed*
2105 specialisation now takes place in the typechecker/desugarer, with
2106 manually specified INLINEs. The specialisation here is automatic.
2107 It'd be very odd if a function marked INLINE was specialised (because
2108 of some local use), and then forever after (including importing
2109 modules) the specialised version wasn't INLINEd. After all, the
2110 programmer said INLINE!
2111
2112 You might wonder why we specialise INLINE functions at all. After
2113 all they should be inlined, right? Two reasons:
2114
2115 * Even INLINE functions are sometimes not inlined, when they aren't
2116 applied to interesting arguments. But perhaps the type arguments
2117 alone are enough to specialise (even though the args are too boring
2118 to trigger inlining), and it's certainly better to call the
2119 specialised version.
2120
2121 * The RHS of an INLINE function might call another overloaded function,
2122 and we'd like to generate a specialised version of that function too.
2123 This actually happens a lot. Consider
2124 replicateM_ :: (Monad m) => Int -> m a -> m ()
2125 {-# INLINABLE replicateM_ #-}
2126 replicateM_ d x ma = ...
2127 The strictness analyser may transform to
2128 replicateM_ :: (Monad m) => Int -> m a -> m ()
2129 {-# INLINE replicateM_ #-}
2130 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
2131
2132 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
2133 {-# INLINABLE $wreplicateM_ #-}
2134 $wreplicateM_ = ...
2135 Now an importing module has a specialised call to replicateM_, say
2136 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
2137 This particular example had a huge effect on the call to replicateM_
2138 in nofib/shootout/n-body.
2139
2140 Why (b): discard INLINABLE pragmas? See #4874 for persuasive examples.
2141 Suppose we have
2142 {-# INLINABLE f #-}
2143 f :: Ord a => [a] -> Int
2144 f xs = letrec f' = ...f'... in f'
2145 Then, when f is specialised and optimised we might get
2146 wgo :: [Int] -> Int#
2147 wgo = ...wgo...
2148 f_spec :: [Int] -> Int
2149 f_spec xs = case wgo xs of { r -> I# r }
2150 and we clearly want to inline f_spec at call sites. But if we still
2151 have the big, un-optimised of f (albeit specialised) captured in an
2152 INLINABLE pragma for f_spec, we won't get that optimisation.
2153
2154 So we simply drop INLINABLE pragmas when specialising. It's not really
2155 a complete solution; ignoring specialisation for now, INLINABLE functions
2156 don't get properly strictness analysed, for example. But it works well
2157 for examples involving specialisation, which is the dominant use of
2158 INLINABLE. See #4874.
2159 -}
2160
2161 {- *********************************************************************
2162 * *
2163 SpecArg, and specHeader
2164 * *
2165 ********************************************************************* -}
2166
2167 -- | An argument that we might want to specialise.
2168 -- See Note [Specialising Calls] for the nitty gritty details.
2169 data SpecArg
2170 =
2171 -- | Type arguments that should be specialised, due to appearing
2172 -- free in the type of a 'SpecDict'.
2173 SpecType Type
2174
2175 -- | Type arguments that should remain polymorphic.
2176 | UnspecType
2177
2178 -- | Dictionaries that should be specialised. mkCallUDs ensures
2179 -- that only "interesting" dictionary arguments get a SpecDict;
2180 -- see Note [Interesting dictionary arguments]
2181 | SpecDict DictExpr
2182
2183 -- | Value arguments that should not be specialised.
2184 | UnspecArg
2185
2186 instance Outputable SpecArg where
2187 ppr (SpecType t) = text "SpecType" <+> ppr t
2188 ppr UnspecType = text "UnspecType"
2189 ppr (SpecDict d) = text "SpecDict" <+> ppr d
2190 ppr UnspecArg = text "UnspecArg"
2191
2192 specArgFreeIds :: SpecArg -> IdSet
2193 specArgFreeIds (SpecType {}) = emptyVarSet
2194 specArgFreeIds (SpecDict dx) = exprFreeIds dx
2195 specArgFreeIds UnspecType = emptyVarSet
2196 specArgFreeIds UnspecArg = emptyVarSet
2197
2198 isSpecDict :: SpecArg -> Bool
2199 isSpecDict (SpecDict {}) = True
2200 isSpecDict _ = False
2201
2202 -- | Given binders from an original function 'f', and the 'SpecArg's
2203 -- corresponding to its usage, compute everything necessary to build
2204 -- a specialisation.
2205 --
2206 -- We will use the running example from Note [Specialising Calls]:
2207 --
2208 -- f :: forall a b c. Int -> Eq a => Show b => c -> Blah
2209 -- f @a @b @c i dEqA dShowA x = blah
2210 --
2211 -- Suppose we decide to specialise it at the following pattern:
2212 --
2213 -- [ SpecType T1, SpecType T2, UnspecType, UnspecArg
2214 -- , SpecDict dEqT1, SpecDict ($dfShow dShowT2), UnspecArg ]
2215 --
2216 -- We'd eventually like to build the RULE
2217 --
2218 -- RULE "SPEC f @T1 @T2 _"
2219 -- forall (@c :: Type) (i :: Int) (d1 :: Eq T1) (d2 :: Show T2).
2220 -- f @T1 @T2 @c i d1 d2 = $sf @c i
2221 --
2222 -- and the specialisation '$sf'
2223 --
2224 -- $sf :: forall c. Int -> c -> Blah
2225 -- $sf = SUBST[a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1] (\@c i x -> blah)
2226 --
2227 -- where dShow1 is a floated binding created by bindAuxiliaryDict.
2228 --
2229 -- The cases for 'specHeader' below are presented in the same order as this
2230 -- running example. The result of 'specHeader' for this example is as follows:
2231 --
2232 -- ( -- Returned arguments
2233 -- env + [a :-> T1, b :-> T2, dEqA :-> dEqT1, dShowA :-> dShow1]
2234 -- , [x]
2235 --
2236 -- -- RULE helpers
2237 -- , [c, i, d1, d2]
2238 -- , [T1, T2, c, i, d1, d2]
2239 --
2240 -- -- Specialised function helpers
2241 -- , [c, i, x]
2242 -- , [dShow1 = $dfShow dShowT2]
2243 -- , [T1, T2, c, i, dEqT1, dShow1]
2244 -- )
2245 specHeader
2246 :: SpecEnv
2247 -> [InBndr] -- The binders from the original function 'f'
2248 -> [SpecArg] -- From the CallInfo
2249 -> SpecM ( Bool -- True <=> some useful specialisation happened
2250 -- Not the same as any (isSpecDict args) because
2251 -- the args might be longer than bndrs
2252
2253 -- Returned arguments
2254 , SpecEnv -- Substitution to apply to the body of 'f'
2255 , [OutBndr] -- Leftover binders from the original function 'f'
2256 -- that don’t have a corresponding SpecArg
2257
2258 -- RULE helpers
2259 , [OutBndr] -- Binders for the RULE
2260 , [OutExpr] -- Args for the LHS of the rule
2261
2262 -- Specialised function helpers
2263 , [OutBndr] -- Binders for $sf
2264 , [DictBind] -- Auxiliary dictionary bindings
2265 , [OutExpr] -- Specialised arguments for unfolding
2266 -- Same length as "Args for LHS of rule"
2267 )
2268
2269 -- We want to specialise on type 'T1', and so we must construct a substitution
2270 -- 'a->T1', as well as a LHS argument for the resulting RULE and unfolding
2271 -- details.
2272 specHeader env (bndr : bndrs) (SpecType ty : args)
2273 = do { let in_scope = Core.substInScope (se_subst env)
2274 qvars = scopedSort $
2275 filterOut (`elemInScopeSet` in_scope) $
2276 tyCoVarsOfTypeList ty
2277 (env1, qvars') = substBndrs env qvars
2278 ty' = substTy env1 ty
2279 env2 = extendTvSubstList env1 [(bndr, ty')]
2280 ; (useful, env3, leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
2281 <- specHeader env2 bndrs args
2282 ; pure ( useful
2283 , env3
2284 , leftover_bndrs
2285 , qvars' ++ rule_bs
2286 , Type ty' : rule_es
2287 , qvars' ++ bs'
2288 , dx
2289 , Type ty' : spec_args
2290 )
2291 }
2292
2293 -- Next we have a type that we don't want to specialise. We need to perform
2294 -- a substitution on it (in case the type refers to 'a'). Additionally, we need
2295 -- to produce a binder, LHS argument and RHS argument for the resulting rule,
2296 -- /and/ a binder for the specialised body.
2297 specHeader env (bndr : bndrs) (UnspecType : args)
2298 = do { let (env', bndr') = substBndr env bndr
2299 ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
2300 <- specHeader env' bndrs args
2301 ; pure ( useful
2302 , env''
2303 , leftover_bndrs
2304 , bndr' : rule_bs
2305 , varToCoreExpr bndr' : rule_es
2306 , bndr' : bs'
2307 , dx
2308 , varToCoreExpr bndr' : spec_args
2309 )
2310 }
2311
2312 -- Next we want to specialise the 'Eq a' dict away. We need to construct
2313 -- a wildcard binder to match the dictionary (See Note [Specialising Calls] for
2314 -- the nitty-gritty), as a LHS rule and unfolding details.
2315 specHeader env (bndr : bndrs) (SpecDict d : args)
2316 = do { bndr' <- newDictBndr env bndr -- See Note [Zap occ info in rule binders]
2317 ; let (env', dx_bind, spec_dict) = bindAuxiliaryDict env bndr bndr' d
2318 ; (_, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
2319 <- specHeader env' bndrs args
2320 ; pure ( True -- Ha! A useful specialisation!
2321 , env''
2322 , leftover_bndrs
2323 -- See Note [Evidence foralls]
2324 , exprFreeIdsList (varToCoreExpr bndr') ++ rule_bs
2325 , varToCoreExpr bndr' : rule_es
2326 , bs'
2327 , maybeToList dx_bind ++ dx
2328 , spec_dict : spec_args
2329 )
2330 }
2331
2332 -- Finally, we have the unspecialised argument 'i'. We need to produce
2333 -- a binder, LHS and RHS argument for the RULE, and a binder for the
2334 -- specialised body.
2335 --
2336 -- NB: Calls to 'specHeader' will trim off any trailing 'UnspecArg's, which is
2337 -- why 'i' doesn't appear in our RULE above. But we have no guarantee that
2338 -- there aren't 'UnspecArg's which come /before/ all of the dictionaries, so
2339 -- this case must be here.
2340 specHeader env (bndr : bndrs) (UnspecArg : args)
2341 = do { -- see Note [Zap occ info in rule binders]
2342 let (env', bndr') = substBndr env (zapIdOccInfo bndr)
2343 ; (useful, env'', leftover_bndrs, rule_bs, rule_es, bs', dx, spec_args)
2344 <- specHeader env' bndrs args
2345
2346 ; let bndr_ty = idType bndr'
2347
2348 -- See Note [Drop dead args from specialisations]
2349 -- C.f. GHC.Core.Opt.WorkWrap.Utils.mk_absent_let
2350 (mb_spec_bndr, spec_arg)
2351 | isDeadBinder bndr
2352 , Just lit_expr <- mkLitRubbish bndr_ty
2353 = (Nothing, lit_expr)
2354 | otherwise
2355 = (Just bndr', varToCoreExpr bndr')
2356
2357 ; pure ( useful
2358 , env''
2359 , leftover_bndrs
2360 , bndr' : rule_bs
2361 , varToCoreExpr bndr' : rule_es
2362 , case mb_spec_bndr of
2363 Just b' -> b' : bs'
2364 Nothing -> bs'
2365 , dx
2366 , spec_arg : spec_args
2367 )
2368 }
2369
2370 -- If we run out of binders, stop immediately
2371 -- See Note [Specialisation Must Preserve Sharing]
2372 specHeader env [] _ = pure (False, env, [], [], [], [], [], [])
2373
2374 -- Return all remaining binders from the original function. These have the
2375 -- invariant that they should all correspond to unspecialised arguments, so
2376 -- it's safe to stop processing at this point.
2377 specHeader env bndrs []
2378 = pure (False, env', bndrs', [], [], [], [], [])
2379 where
2380 (env', bndrs') = substBndrs env bndrs
2381
2382
2383 -- | Binds a dictionary argument to a fresh name, to preserve sharing
2384 bindAuxiliaryDict
2385 :: SpecEnv
2386 -> InId -> OutId -> OutExpr -- Original dict binder, and the witnessing expression
2387 -> ( SpecEnv -- Substitute for orig_dict_id
2388 , Maybe DictBind -- Auxiliary dict binding, if any
2389 , OutExpr) -- Witnessing expression (always trivial)
2390 bindAuxiliaryDict env@(SE { se_subst = subst, se_interesting = interesting })
2391 orig_dict_id fresh_dict_id dict_expr
2392
2393 -- If the dictionary argument is trivial,
2394 -- don’t bother creating a new dict binding; just substitute
2395 | Just dict_id <- getIdFromTrivialExpr_maybe dict_expr
2396 = let env' = env { se_subst = Core.extendSubst subst orig_dict_id dict_expr
2397 `Core.extendInScope` dict_id
2398 -- See Note [Keep the old dictionaries interesting]
2399 , se_interesting = interesting `extendVarSet` dict_id }
2400 in (env', Nothing, dict_expr)
2401
2402 | otherwise -- Non-trivial dictionary arg; make an auxiliary binding
2403 = let dict_bind = mkDB (NonRec fresh_dict_id dict_expr)
2404 env' = env { se_subst = Core.extendSubst subst orig_dict_id (Var fresh_dict_id)
2405 `Core.extendInScope` fresh_dict_id
2406 -- See Note [Make the new dictionaries interesting]
2407 , se_interesting = interesting `extendVarSet` fresh_dict_id }
2408 in (env', Just dict_bind, Var fresh_dict_id)
2409
2410 {-
2411 Note [Make the new dictionaries interesting]
2412 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2413 Important! We're going to substitute dx_id1 for d
2414 and we want it to look "interesting", else we won't gather *any*
2415 consequential calls. E.g.
2416 f d = ...g d....
2417 If we specialise f for a call (f (dfun dNumInt)), we'll get
2418 a consequent call (g d') with an auxiliary definition
2419 d' = df dNumInt
2420 We want that consequent call to look interesting
2421
2422 Note [Keep the old dictionaries interesting]
2423 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2424 In bindAuxiliaryDict, we don’t bother creating a new dict binding if
2425 the dict expression is trivial. For example, if we have
2426
2427 f = \ @m1 (d1 :: Monad m1) -> ...
2428
2429 and we specialize it at the pattern
2430
2431 [SpecType IO, SpecArg $dMonadIO]
2432
2433 it would be silly to create a new binding for $dMonadIO; it’s already
2434 a binding! So we just extend the substitution directly:
2435
2436 m1 :-> IO
2437 d1 :-> $dMonadIO
2438
2439 But this creates a new subtlety: the dict expression might be a dict
2440 binding we floated out while specializing another function. For
2441 example, we might have
2442
2443 d2 = $p1Monad $dMonadIO -- floated out by bindAuxiliaryDict
2444 $sg = h @IO d2
2445 h = \ @m2 (d2 :: Applicative m2) -> ...
2446
2447 and end up specializing h at the following pattern:
2448
2449 [SpecType IO, SpecArg d2]
2450
2451 When we created the d2 binding in the first place, we locally marked
2452 it as interesting while specializing g as described above by
2453 Note [Make the new dictionaries interesting]. But when we go to
2454 specialize h, it isn’t in the SpecEnv anymore, so we’ve lost the
2455 knowledge that we should specialize on it.
2456
2457 To fix this, we have to explicitly add d2 *back* to the interesting
2458 set. That way, it will still be considered interesting while
2459 specializing the body of h. See !2913.
2460 -}
2461
2462
2463 {- *********************************************************************
2464 * *
2465 UsageDetails and suchlike
2466 * *
2467 ********************************************************************* -}
2468
2469 data UsageDetails
2470 = MkUD {
2471 ud_binds :: !(Bag DictBind),
2472 -- See Note [Floated dictionary bindings]
2473 -- The order is important;
2474 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
2475 -- (Remember, Bags preserve order in GHC.)
2476
2477 ud_calls :: !CallDetails
2478
2479 -- INVARIANT: suppose bs = bindersOf ud_binds
2480 -- Then 'calls' may *mention* 'bs',
2481 -- but there should be no calls *for* bs
2482 }
2483
2484 -- | A 'DictBind' is a binding along with a cached set containing its free
2485 -- variables (both type variables and dictionaries)
2486 data DictBind = DB { db_bind :: CoreBind, db_fvs :: VarSet }
2487
2488 {- Note [Floated dictionary bindings]
2489 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2490 We float out dictionary bindings for the reasons described under
2491 "Dictionary floating" above. But not /just/ dictionary bindings.
2492 Consider
2493
2494 f :: Eq a => blah
2495 f a d = rhs
2496
2497 $c== :: T -> T -> Bool
2498 $c== x y = ...
2499
2500 $df :: Eq T
2501 $df = Eq $c== ...
2502
2503 gurgle = ...(f @T $df)...
2504
2505 We gather the call info for (f @T $df), and we don't want to drop it
2506 when we come across the binding for $df. So we add $df to the floats
2507 and continue. But then we have to add $c== to the floats, and so on.
2508 These all float above the binding for 'f', and now we can
2509 successfully specialise 'f'.
2510
2511 So the DictBinds in (ud_binds :: Bag DictBind) may contain
2512 non-dictionary bindings too.
2513
2514 Note [Specialising polymorphic dictionaries]
2515 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2516 Consider
2517 class M a where { foo :: a -> Int }
2518
2519 instance M (ST s) where ...
2520 -- dMST :: forall s. M (ST s)
2521
2522 wimwam :: forall a. M a => a -> Int
2523 wimwam = /\a \(d::M a). body
2524
2525 f :: ST s -> Int
2526 f = /\s \(x::ST s). wimwam @(ST s) (dMST @s) dx + 1
2527
2528 We'd like to specialise wimwam at (ST s), thus
2529 $swimwam :: forall s. ST s -> Int
2530 $swimwam = /\s. body[ST s/a, (dMST @s)/d]
2531
2532 RULE forall s (d :: M (ST s)).
2533 wimwam @(ST s) d = $swimwam @s
2534
2535 Here are the moving parts:
2536
2537 * We must /not/ dump the CallInfo
2538 CIS wimwam (CI { ci_key = [@(ST s), dMST @s]
2539 , ci_fvs = {dMST} })
2540 when we come to the /\s. Instead, we simply let it continue to float
2541 upwards. Hence ci_fvs is an IdSet, listing the /Ids/ that
2542 are free in the call, but not the /TyVars/. Hence using specArgFreeIds
2543 in singleCall.
2544
2545 NB to be fully kosher we should explicitly quantifying the CallInfo
2546 over 's', but we don't bother. This would matter if there was an
2547 enclosing binding of the same 's', which I don't expect to happen.
2548
2549 * Whe we come to specialise the call, we must remember to quantify
2550 over 's'. That is done in the SpecType case of specHeader, where
2551 we add 's' (called qvars) to the binders of the RULE and the specialised
2552 function.
2553
2554 * If we have f :: forall m. Monoid m => blah, and two calls
2555 (f @(Endo b) (d :: Monoid (Endo b))
2556 (f @(Endo (c->c)) (d :: Monoid (Endo (c->c)))
2557 we want to generate a specialisation only for the first. The second
2558 is just a substitution instance of the first, with no greater specialisation.
2559 Hence the call to `remove_dups` in `filterCalls`.
2560
2561 All this arose in #13873, in the unexpected form that a SPECIALISE
2562 pragma made the program slower! The reason was that the specialised
2563 function $sinsertWith arising from the pragma looked rather like `f`
2564 above, and failed to specialise a call in its body like wimwam.
2565 Without the pragma, the original call to `insertWith` was completely
2566 monomorphic, and specialised in one go.
2567 -}
2568
2569 instance Outputable DictBind where
2570 ppr (DB { db_bind = bind, db_fvs = fvs })
2571 = text "DB" <+> braces (sep [ text "bind:" <+> ppr bind
2572 , text "fvs: " <+> ppr fvs ])
2573
2574 instance Outputable UsageDetails where
2575 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
2576 = text "MkUD" <+> braces (sep (punctuate comma
2577 [text "binds" <+> equals <+> ppr dbs,
2578 text "calls" <+> equals <+> ppr calls]))
2579
2580 emptyUDs :: UsageDetails
2581 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
2582
2583 ------------------------------------------------------------
2584 type CallDetails = DIdEnv CallInfoSet
2585 -- The order of specialized binds and rules depends on how we linearize
2586 -- CallDetails, so to get determinism we must use a deterministic set here.
2587 -- See Note [Deterministic UniqFM] in GHC.Types.Unique.DFM
2588
2589 data CallInfoSet = CIS Id (Bag CallInfo)
2590 -- The list of types and dictionaries is guaranteed to
2591 -- match the type of f
2592 -- The Bag may contain duplicate calls (i.e. f @T and another f @T)
2593 -- These dups are eliminated by already_covered in specCalls
2594
2595 data CallInfo
2596 = CI { ci_key :: [SpecArg] -- All arguments
2597 , ci_fvs :: IdSet -- Free Ids of the ci_key call
2598 -- _not_ including the main id itself, of course
2599 -- NB: excluding tyvars:
2600 -- See Note [Specialising polymorphic dictionaries]
2601 }
2602
2603 type DictExpr = CoreExpr
2604
2605 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
2606 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
2607
2608 instance Outputable CallInfoSet where
2609 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
2610 2 (ppr map)
2611
2612 pprCallInfo :: Id -> CallInfo -> SDoc
2613 pprCallInfo fn (CI { ci_key = key })
2614 = ppr fn <+> ppr key
2615
2616 ppr_call_key_ty :: SpecArg -> Maybe SDoc
2617 ppr_call_key_ty (SpecType ty) = Just $ char '@' <> pprParendType ty
2618 ppr_call_key_ty UnspecType = Just $ char '_'
2619 ppr_call_key_ty (SpecDict _) = Nothing
2620 ppr_call_key_ty UnspecArg = Nothing
2621
2622 instance Outputable CallInfo where
2623 ppr (CI { ci_key = key, ci_fvs = _fvs })
2624 = text "CI" <> braces (sep (map ppr key))
2625
2626 unionCalls :: CallDetails -> CallDetails -> CallDetails
2627 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
2628
2629 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
2630 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
2631 CIS f (calls1 `unionBags` calls2)
2632
2633 callDetailsFVs :: CallDetails -> VarSet
2634 callDetailsFVs calls =
2635 nonDetStrictFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
2636 -- It's OK to use nonDetStrictFoldUDFM here because we forget the ordering
2637 -- immediately by converting to a nondeterministic set.
2638
2639 callInfoFVs :: CallInfoSet -> VarSet
2640 callInfoFVs (CIS _ call_info) =
2641 foldr (\(CI { ci_fvs = fv }) vs -> unionVarSet fv vs) emptyVarSet call_info
2642
2643 getTheta :: [TyCoBinder] -> [PredType]
2644 getTheta = fmap tyBinderType . filter isInvisibleBinder . filter (not . isNamedBinder)
2645
2646
2647 ------------------------------------------------------------
2648 singleCall :: Id -> [SpecArg] -> UsageDetails
2649 singleCall id args
2650 = MkUD {ud_binds = emptyBag,
2651 ud_calls = unitDVarEnv id $ CIS id $
2652 unitBag (CI { ci_key = args -- used to be tys
2653 , ci_fvs = call_fvs }) }
2654 where
2655 call_fvs = foldr (unionVarSet . specArgFreeIds) emptyVarSet args
2656 -- The type args (tys) are guaranteed to be part of the dictionary
2657 -- types, because they are just the constrained types,
2658 -- and the dictionary is therefore sure to be bound
2659 -- inside the binding for any type variables free in the type;
2660 -- hence it's safe to neglect tyvars free in tys when making
2661 -- the free-var set for this call
2662 -- BUT I don't trust this reasoning; play safe and include tys_fvs
2663 --
2664 -- We don't include the 'id' itself.
2665
2666 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
2667 mkCallUDs env f args
2668 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
2669 res
2670 where
2671 res = mkCallUDs' env f args
2672
2673 mkCallUDs' env f args
2674 | wantCallsFor env f -- We want it, and...
2675 , not (null ci_key) -- this call site has a useful specialisation
2676 = -- pprTrace "mkCallUDs: keeping" _trace_doc
2677 singleCall f ci_key
2678
2679 | otherwise -- See also Note [Specialisations already covered]
2680 = -- pprTrace "mkCallUDs: discarding" _trace_doc
2681 emptyUDs
2682
2683 where
2684 _trace_doc = vcat [ppr f, ppr args, ppr ci_key]
2685 pis = fst $ splitPiTys $ idType f
2686 constrained_tyvars = tyCoVarsOfTypes $ getTheta pis
2687
2688 ci_key :: [SpecArg]
2689 ci_key = dropWhileEndLE (not . isSpecDict) $
2690 zipWith mk_spec_arg args pis
2691 -- Drop trailing args until we get to a SpecDict
2692 -- In this way the RULE has as few args as possible,
2693 -- which broadens its applicability, since rules only
2694 -- fire when saturated
2695
2696 mk_spec_arg :: CoreExpr -> TyCoBinder -> SpecArg
2697 mk_spec_arg arg (Named bndr)
2698 | binderVar bndr `elemVarSet` constrained_tyvars
2699 = case arg of
2700 Type ty -> SpecType ty
2701 _ -> pprPanic "ci_key" $ ppr arg
2702 | otherwise = UnspecType
2703
2704 -- For "InvisArg", which are the type-class dictionaries,
2705 -- we decide on a case by case basis if we want to specialise
2706 -- on this argument; if so, SpecDict, if not UnspecArg
2707 mk_spec_arg arg (Anon InvisArg pred)
2708 | not (isIPLikePred (scaledThing pred))
2709 -- See Note [Type determines value]
2710 , interestingDict env arg
2711 -- See Note [Interesting dictionary arguments]
2712 = SpecDict arg
2713
2714 | otherwise = UnspecArg
2715
2716 mk_spec_arg _ (Anon VisArg _)
2717 = UnspecArg
2718
2719 wantCallsFor :: SpecEnv -> Id -> Bool
2720 wantCallsFor _env _f = True
2721 -- We could reduce the size of the UsageDetails by being less eager
2722 -- about collecting calls for LocalIds: there is no point for
2723 -- ones that are lambda-bound. We can't decide this by looking at
2724 -- the (absence of an) unfolding, because unfoldings for local
2725 -- functions are discarded by cloneBindSM, so no local binder will
2726 -- have an unfolding at this stage. We'd have to keep a candidate
2727 -- set of let-binders.
2728 --
2729 -- Not many lambda-bound variables have dictionary arguments, so
2730 -- this would make little difference anyway.
2731 --
2732 -- For imported Ids we could check for an unfolding, but we have to
2733 -- do so anyway in canSpecImport, and it seems better to have it
2734 -- all in one place. So we simply collect usage info for imported
2735 -- overloaded functions.
2736
2737 {- Note [Type determines value]
2738 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2739 Only specialise on non-impicit-parameter predicates, because these
2740 are the ones whose *type* determines their *value*. In particular,
2741 with implicit params, the type args *don't* say what the value of the
2742 implicit param is! See #7101.
2743
2744 So we treat implicit params just like ordinary arguments for the
2745 purposes of specialisation. Note that we still want to specialise
2746 functions with implicit params if they have *other* dicts which are
2747 class params; see #17930.
2748
2749 Note [Interesting dictionary arguments]
2750 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2751 Consider this
2752 \a.\d:Eq a. let f = ... in ...(f d)...
2753 There really is not much point in specialising f wrt the dictionary d,
2754 because the code for the specialised f is not improved at all, because
2755 d is lambda-bound. We simply get junk specialisations.
2756
2757 What is "interesting"? Just that it has *some* structure. But what about
2758 variables?
2759
2760 * A variable might be imported, in which case its unfolding
2761 will tell us whether it has useful structure
2762
2763 * Local variables are cloned on the way down (to avoid clashes when
2764 we float dictionaries), and cloning drops the unfolding
2765 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
2766 nuisance to give them unfoldings. So we keep track of the
2767 "interesting" dictionaries as a VarSet in SpecEnv.
2768 We have to take care to put any new interesting dictionary
2769 bindings in the set.
2770
2771 We accidentally lost accurate tracking of local variables for a long
2772 time, because cloned variables don't have unfoldings. But makes a
2773 massive difference in a few cases, eg #5113. For nofib as a
2774 whole it's only a small win: 2.2% improvement in allocation for ansi,
2775 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
2776 -}
2777
2778 interestingDict :: SpecEnv -> CoreExpr -> Bool
2779 -- A dictionary argument is interesting if it has *some* structure
2780 -- NB: "dictionary" arguments include constraints of all sorts,
2781 -- including equality constraints; hence the Coercion case
2782 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
2783 || isDataConWorkId v
2784 || v `elemVarSet` se_interesting env
2785 interestingDict _ (Type _) = False
2786 interestingDict _ (Coercion _) = False
2787 interestingDict env (App fn (Type _)) = interestingDict env fn
2788 interestingDict env (App fn (Coercion _)) = interestingDict env fn
2789 interestingDict env (Tick _ a) = interestingDict env a
2790 interestingDict env (Cast e _) = interestingDict env e
2791 interestingDict _ _ = True
2792
2793 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
2794 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
2795 (MkUD {ud_binds = db2, ud_calls = calls2})
2796 = MkUD { ud_binds = db1 `unionBags` db2
2797 , ud_calls = calls1 `unionCalls` calls2 }
2798
2799 -----------------------------
2800 _dictBindBndrs :: Bag DictBind -> [Id]
2801 _dictBindBndrs dbs = foldr ((++) . bindersOf . db_bind) [] dbs
2802
2803 -- | Construct a 'DictBind' from a 'CoreBind'
2804 mkDB :: CoreBind -> DictBind
2805 mkDB bind = DB { db_bind = bind, db_fvs = bind_fvs bind }
2806
2807 -- | Identify the free variables of a 'CoreBind'
2808 bind_fvs :: CoreBind -> VarSet
2809 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2810 bind_fvs (Rec prs) = foldl' delVarSet rhs_fvs bndrs
2811 where
2812 bndrs = map fst prs
2813 rhs_fvs = unionVarSets (map pair_fvs prs)
2814
2815 pair_fvs :: (Id, CoreExpr) -> VarSet
2816 pair_fvs (bndr, rhs) = exprSomeFreeVars interesting rhs
2817 `unionVarSet` idFreeVars bndr
2818 -- idFreeVars: don't forget variables mentioned in
2819 -- the rules of the bndr. C.f. OccAnal.addRuleUsage
2820 -- Also tyvars mentioned in its type; they may not appear
2821 -- in the RHS
2822 -- type T a = Int
2823 -- x :: T a = 3
2824 where
2825 interesting :: InterestingVarFun
2826 interesting v = isLocalVar v || (isId v && isDFunId v)
2827 -- Very important: include DFunIds /even/ if it is imported
2828 -- Reason: See Note [Avoiding loops in specImports], the #13429
2829 -- example involving an imported dfun. We must know
2830 -- whether a dictionary binding depends on an imported
2831 -- DFun in case we try to specialise that imported DFun
2832
2833 -- | Flatten a set of "dumped" 'DictBind's, and some other binding
2834 -- pairs, into a single recursive binding.
2835 recWithDumpedDicts :: [(Id,CoreExpr)] -> Bag DictBind -> DictBind
2836 recWithDumpedDicts pairs dbs
2837 = DB { db_bind = Rec bindings, db_fvs = fvs }
2838 where
2839 (bindings, fvs) = foldr add ([], emptyVarSet)
2840 (dbs `snocBag` mkDB (Rec pairs))
2841 add (DB { db_bind = bind, db_fvs = fvs }) (prs_acc, fvs_acc)
2842 = case bind of
2843 NonRec b r -> ((b,r) : prs_acc, fvs')
2844 Rec prs1 -> (prs1 ++ prs_acc, fvs')
2845 where
2846 fvs' = fvs_acc `unionVarSet` fvs
2847
2848 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2849 -- Add ud_binds to the tail end of the bindings in uds
2850 snocDictBinds uds dbs
2851 = uds { ud_binds = ud_binds uds `unionBags` listToBag dbs }
2852
2853 consDictBind :: DictBind -> UsageDetails -> UsageDetails
2854 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2855
2856 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2857 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2858
2859 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2860 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2861
2862 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2863 wrapDictBinds dbs binds
2864 = foldr add binds dbs
2865 where
2866 add (DB { db_bind = bind }) binds = bind : binds
2867
2868 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2869 wrapDictBindsE dbs expr
2870 = foldr add expr dbs
2871 where
2872 add (DB { db_bind = bind }) expr = Let bind expr
2873
2874 ----------------------
2875 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2876 -- Used at a lambda or case binder; just dump anything mentioning the binder
2877 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2878 | null bndrs = (uds, emptyBag) -- Common in case alternatives
2879 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2880 (free_uds, dump_dbs)
2881 where
2882 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2883 bndr_set = mkVarSet bndrs
2884 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2885 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
2886 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
2887 -- no calls for any of the dicts in dump_dbs
2888
2889 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2890 -- Used at a let(rec) binding.
2891 -- We return a boolean indicating whether the binding itself is mentioned,
2892 -- directly or indirectly, by any of the ud_calls; in that case we want to
2893 -- float the binding itself;
2894 -- See Note [Floated dictionary bindings]
2895 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2896 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2897 (free_uds, dump_dbs, float_all)
2898 where
2899 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2900 bndr_set = mkVarSet bndrs
2901 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2902 free_calls = deleteCallsFor bndrs orig_calls
2903 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2904
2905 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2906 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2907 = -- pprTrace ("callsForMe")
2908 -- (vcat [ppr fn,
2909 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2910 -- text "Orig calls =" <+> ppr orig_calls,
2911 -- text "Dep set =" <+> ppr dep_set,
2912 -- text "Calls for me =" <+> ppr calls_for_me]) $
2913 (uds_without_me, calls_for_me)
2914 where
2915 uds_without_me = MkUD { ud_binds = orig_dbs
2916 , ud_calls = delDVarEnv orig_calls fn }
2917 calls_for_me = case lookupDVarEnv orig_calls fn of
2918 Nothing -> []
2919 Just cis -> filterCalls cis orig_dbs
2920 -- filterCalls: drop calls that (directly or indirectly)
2921 -- refer to fn. See Note [Avoiding loops (DFuns)]
2922
2923 ----------------------
2924 filterCalls :: CallInfoSet -> Bag DictBind -> [CallInfo]
2925 -- Remove dominated calls (Note [Specialising polymorphic dictionaries])
2926 -- and loopy DFuns (Note [Avoiding loops (DFuns)])
2927 filterCalls (CIS fn call_bag) dbs
2928 | isDFunId fn -- Note [Avoiding loops (DFuns)] applies only to DFuns
2929 = filter ok_call de_dupd_calls
2930 | otherwise -- Do not apply it to non-DFuns
2931 = de_dupd_calls -- See Note [Avoiding loops (non-DFuns)]
2932 where
2933 de_dupd_calls = remove_dups call_bag
2934
2935 dump_set = foldl' go (unitVarSet fn) dbs
2936 -- This dump-set could also be computed by splitDictBinds
2937 -- (_,_,dump_set) = splitDictBinds dbs {fn}
2938 -- But this variant is shorter
2939
2940 go so_far (DB { db_bind = bind, db_fvs = fvs })
2941 | fvs `intersectsVarSet` so_far
2942 = extendVarSetList so_far (bindersOf bind)
2943 | otherwise = so_far
2944
2945 ok_call (CI { ci_fvs = fvs }) = fvs `disjointVarSet` dump_set
2946
2947 remove_dups :: Bag CallInfo -> [CallInfo]
2948 remove_dups calls = foldr add [] calls
2949 where
2950 add :: CallInfo -> [CallInfo] -> [CallInfo]
2951 add ci [] = [ci]
2952 add ci1 (ci2:cis) | ci2 `beats_or_same` ci1 = ci2:cis
2953 | ci1 `beats_or_same` ci2 = ci1:cis
2954 | otherwise = ci2 : add ci1 cis
2955
2956 beats_or_same :: CallInfo -> CallInfo -> Bool
2957 beats_or_same (CI { ci_key = args1 }) (CI { ci_key = args2 })
2958 = go args1 args2
2959 where
2960 go [] _ = True
2961 go (arg1:args1) (arg2:args2) = go_arg arg1 arg2 && go args1 args2
2962 go (_:_) [] = False
2963
2964 go_arg (SpecType ty1) (SpecType ty2) = isJust (tcMatchTy ty1 ty2)
2965 go_arg UnspecType UnspecType = True
2966 go_arg (SpecDict {}) (SpecDict {}) = True
2967 go_arg UnspecArg UnspecArg = True
2968 go_arg _ _ = False
2969
2970 ----------------------
2971 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2972 -- splitDictBinds dbs bndrs returns
2973 -- (free_dbs, dump_dbs, dump_set)
2974 -- where
2975 -- * dump_dbs depends, transitively on bndrs
2976 -- * free_dbs does not depend on bndrs
2977 -- * dump_set = bndrs `union` bndrs(dump_dbs)
2978 splitDictBinds dbs bndr_set
2979 = foldl' split_db (emptyBag, emptyBag, bndr_set) dbs
2980 -- Important that it's foldl' not foldr;
2981 -- we're accumulating the set of dumped ids in dump_set
2982 where
2983 split_db (free_dbs, dump_dbs, dump_idset) db
2984 | DB { db_bind = bind, db_fvs = fvs } <- db
2985 , dump_idset `intersectsVarSet` fvs -- Dump it
2986 = (free_dbs, dump_dbs `snocBag` db,
2987 extendVarSetList dump_idset (bindersOf bind))
2988
2989 | otherwise -- Don't dump it
2990 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2991
2992
2993 ----------------------
2994 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2995 -- Remove calls mentioning any Id in bndrs
2996 -- NB: The call is allowed to mention TyVars in bndrs
2997 -- Note [Specialising polymorphic dictionaries]
2998 -- ci_fvs are just the free /Ids/
2999 deleteCallsMentioning bndrs calls
3000 = mapDVarEnv (ciSetFilter keep_call) calls
3001 where
3002 keep_call (CI { ci_fvs = fvs }) = fvs `disjointVarSet` bndrs
3003
3004 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
3005 -- Remove calls *for* bndrs
3006 deleteCallsFor bndrs calls = delDVarEnvList calls bndrs
3007
3008 {-
3009 ************************************************************************
3010 * *
3011 \subsubsection{Boring helper functions}
3012 * *
3013 ************************************************************************
3014 -}
3015
3016 type SpecM a = UniqSM a
3017
3018 runSpecM :: SpecM a -> CoreM a
3019 runSpecM thing_inside
3020 = do { us <- getUniqueSupplyM
3021 ; return (initUs_ us thing_inside) }
3022
3023 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
3024 mapAndCombineSM _ [] = return ([], emptyUDs)
3025 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
3026 (ys, uds2) <- mapAndCombineSM f xs
3027 return (y:ys, uds1 `plusUDs` uds2)
3028
3029 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
3030 extendTvSubstList env tv_binds
3031 = env { se_subst = Core.extendTvSubstList (se_subst env) tv_binds }
3032
3033 substTy :: SpecEnv -> Type -> Type
3034 substTy env ty = Core.substTy (se_subst env) ty
3035
3036 substCo :: SpecEnv -> Coercion -> Coercion
3037 substCo env co = Core.substCo (se_subst env) co
3038
3039 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
3040 substBndr env bs = case Core.substBndr (se_subst env) bs of
3041 (subst', bs') -> (env { se_subst = subst' }, bs')
3042
3043 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
3044 substBndrs env bs = case Core.substBndrs (se_subst env) bs of
3045 (subst', bs') -> (env { se_subst = subst' }, bs')
3046
3047 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
3048 -- Clone the binders of the bind; return new bind with the cloned binders
3049 -- Return the substitution to use for RHSs, and the one to use for the body
3050 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
3051 = do { us <- getUniqueSupplyM
3052 ; let (subst', bndr') = Core.cloneIdBndr subst us bndr
3053 interesting' | interestingDict env rhs
3054 = interesting `extendVarSet` bndr'
3055 | otherwise = interesting
3056 ; return (env, env { se_subst = subst', se_interesting = interesting' }
3057 , NonRec bndr' rhs) }
3058
3059 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
3060 = do { us <- getUniqueSupplyM
3061 ; let (subst', bndrs') = Core.cloneRecIdBndrs subst us (map fst pairs)
3062 env' = env { se_subst = subst'
3063 , se_interesting = interesting `extendVarSetList`
3064 [ v | (v,r) <- pairs, interestingDict env r ] }
3065 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
3066
3067 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
3068 -- Make up completely fresh binders for the dictionaries
3069 -- Their bindings are going to float outwards
3070 newDictBndr env b = do { uniq <- getUniqueM
3071 ; let n = idName b
3072 ty' = substTy env (idType b)
3073 ; return (mkUserLocal (nameOccName n) uniq Many ty' (getSrcSpan n)) }
3074
3075 newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id
3076 -- Give the new Id a similar occurrence name to the old one
3077 newSpecIdSM old_id new_ty join_arity_maybe
3078 = do { uniq <- getUniqueM
3079 ; let name = idName old_id
3080 new_occ = mkSpecOcc (nameOccName name)
3081 new_id = mkUserLocal new_occ uniq Many new_ty (getSrcSpan name)
3082 `asJoinId_maybe` join_arity_maybe
3083 ; return new_id }
3084
3085 {-
3086 Old (but interesting) stuff about unboxed bindings
3087 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3088
3089 What should we do when a value is specialised to a *strict* unboxed value?
3090
3091 map_*_* f (x:xs) = let h = f x
3092 t = map f xs
3093 in h:t
3094
3095 Could convert let to case:
3096
3097 map_*_Int# f (x:xs) = case f x of h# ->
3098 let t = map f xs
3099 in h#:t
3100
3101 This may be undesirable since it forces evaluation here, but the value
3102 may not be used in all branches of the body. In the general case this
3103 transformation is impossible since the mutual recursion in a letrec
3104 cannot be expressed as a case.
3105
3106 There is also a problem with top-level unboxed values, since our
3107 implementation cannot handle unboxed values at the top level.
3108
3109 Solution: Lift the binding of the unboxed value and extract it when it
3110 is used:
3111
3112 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
3113 t = map f xs
3114 in case h of
3115 _Lift h# -> h#:t
3116
3117 Now give it to the simplifier and the _Lifting will be optimised away.
3118
3119 The benefit is that we have given the specialised "unboxed" values a
3120 very simple lifted semantics and then leave it up to the simplifier to
3121 optimise it --- knowing that the overheads will be removed in nearly
3122 all cases.
3123
3124 In particular, the value will only be evaluated in the branches of the
3125 program which use it, rather than being forced at the point where the
3126 value is bound. For example:
3127
3128 filtermap_*_* p f (x:xs)
3129 = let h = f x
3130 t = ...
3131 in case p x of
3132 True -> h:t
3133 False -> t
3134 ==>
3135 filtermap_*_Int# p f (x:xs)
3136 = let h = case (f x) of h# -> _Lift h#
3137 t = ...
3138 in case p x of
3139 True -> case h of _Lift h#
3140 -> h#:t
3141 False -> t
3142
3143 The binding for h can still be inlined in the one branch and the
3144 _Lifting eliminated.
3145
3146
3147 Question: When won't the _Lifting be eliminated?
3148
3149 Answer: When they at the top-level (where it is necessary) or when
3150 inlining would duplicate work (or possibly code depending on
3151 options). However, the _Lifting will still be eliminated if the
3152 strictness analyser deems the lifted binding strict.
3153 -}