never executed always true always false
    1 
    2 -- | Constructed Product Result analysis. Identifies functions that surely
    3 -- return heap-allocated records on every code path, so that we can eliminate
    4 -- said heap allocation by performing a worker/wrapper split.
    5 --
    6 -- See https://www.microsoft.com/en-us/research/publication/constructed-product-result-analysis-haskell/.
    7 -- CPR analysis should happen after strictness analysis.
    8 -- See Note [Phase ordering].
    9 module GHC.Core.Opt.CprAnal ( cprAnalProgram ) where
   10 
   11 import GHC.Prelude
   12 
   13 import GHC.Driver.Session
   14 
   15 import GHC.Builtin.Names ( runRWKey )
   16 
   17 import GHC.Types.Var.Env
   18 import GHC.Types.Basic
   19 import GHC.Types.Id
   20 import GHC.Types.Id.Info
   21 import GHC.Types.Demand
   22 import GHC.Types.Cpr
   23 
   24 import GHC.Core.FamInstEnv
   25 import GHC.Core.DataCon
   26 import GHC.Core.Type
   27 import GHC.Core.Utils
   28 import GHC.Core
   29 import GHC.Core.Seq
   30 import GHC.Core.Opt.WorkWrap.Utils
   31 
   32 import GHC.Data.Graph.UnVar -- for UnVarSet
   33 
   34 import GHC.Utils.Outputable
   35 import GHC.Utils.Misc
   36 import GHC.Utils.Panic
   37 import GHC.Utils.Panic.Plain
   38 import GHC.Utils.Logger  ( Logger, putDumpFileMaybe, DumpFormat (..) )
   39 
   40 import Data.List ( mapAccumL )
   41 
   42 {- Note [Constructed Product Result]
   43 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   44 The goal of Constructed Product Result analysis is to identify functions that
   45 surely return heap-allocated records on every code path, so that we can
   46 eliminate said heap allocation by performing a worker/wrapper split
   47 (via 'GHC.Core.Opt.WorkWrap.Utils.mkWWcpr_entry').
   48 
   49 `swap` below is such a function:
   50 ```
   51   swap (a, b) = (b, a)
   52 ```
   53 A `case` on an application of `swap`, like
   54 `case swap (10, 42) of (a, b) -> a + b` could cancel away
   55 (by case-of-known-constructor) if we \"inlined\" `swap` and simplified. We then
   56 say that `swap` has the CPR property.
   57 
   58 We can't inline recursive functions, but similar reasoning applies there:
   59 ```
   60   f x n = case n of
   61     0 -> (x, 0)
   62     _ -> f (x+1) (n-1)
   63 ```
   64 Inductively, `case f 1 2 of (a, b) -> a + b` could cancel away the constructed
   65 product with the case. So `f`, too, has the CPR property. But we can't really
   66 "inline" `f`, because it's recursive. Also, non-recursive functions like `swap`
   67 might be too big to inline (or even marked NOINLINE). We still want to exploit
   68 the CPR property, and that is exactly what the worker/wrapper transformation
   69 can do for us:
   70 ```
   71   $wf x n = case n of
   72     0 -> case (x, 0) of -> (a, b) -> (# a, b #)
   73     _ -> case f (x+1) (n-1) of (a, b) -> (# a, b #)
   74   f x n = case $wf x n of (# a, b #) -> (a, b)
   75 ```
   76 where $wf readily simplifies (by case-of-known-constructor and inlining `f`) to:
   77 ```
   78   $wf x n = case n of
   79     0 -> (# x, 0 #)
   80     _ -> $wf (x+1) (n-1)
   81 ```
   82 Now, a call site like `case f 1 2 of (a, b) -> a + b` can inline `f` and
   83 eliminate the heap-allocated pair constructor.
   84 
   85 Note [Nested CPR]
   86 ~~~~~~~~~~~~~~~~~
   87 We can apply Note [Constructed Product Result] deeper than just the top-level
   88 result constructor of a function, e.g.,
   89 ```
   90   g x
   91     | even x = (x+1,x+2) :: (Int, Int)
   92     | odd  x = (x+2,x+3)
   93 ```
   94 Not only does `g` return a constructed pair, the pair components /also/ have the
   95 CPR property. We can split `g` for its /nested/ CPR property, as follows:
   96 ```
   97   $wg (x :: Int#)
   98     | .. x .. = (# x +# 1#, x +# 2# #) :: (# Int#, Int# #)
   99     | .. x .. = (# x +# 2#, x +# 3# #)
  100   g (I# x) = case $wf x of (# y, z #) -> (I# y, I# z)
  101 ```
  102 Note however that in the following we will only unbox the second component,
  103 even if `foo` has the CPR property:
  104 ```
  105   h x
  106     | even x = (foo x, x+2) :: (Int, Int)
  107     | odd  x = (x+2,   x+3)
  108     -- where `foo` has the CPR property
  109 ```
  110 Why can't we also unbox `foo x`? Because in order to do so, we have to evaluate
  111 it and that might diverge, so we cannot give `h` the nested CPR property in the
  112 first component of the result.
  113 
  114 The Right Thing is to do a termination analysis, to see if we can guarantee that
  115 `foo` terminates quickly, in which case we can speculatively evaluate `foo x` and
  116 hence give `h` a nested CPR property.  That is done in !1866.  But for now we
  117 have an incredibly simple termination analysis; an expression terminates fast
  118 iff it is in HNF: see `exprTerminates`. We call `exprTerminates` in
  119 `cprTransformDataConWork`, which is the main function figuring out whether it's
  120 OK to propagate nested CPR info (in `extract_nested_cpr`).
  121 
  122 In addition to `exprTerminates`, `extract_nested_cpr` also looks at the
  123 `StrictnessMark` of the corresponding constructor field. Example:
  124 ```
  125   data T a = MkT !a
  126   h2 x
  127     | even x = MkT (foo x) :: T Int
  128     | odd  x = MkT (x+2)
  129     -- where `foo` has the CPR property
  130 ```
  131 Regardless of whether or not `foo` terminates, we may unbox the strict field,
  132 because it has to be evaluated (the Core for `MkT (foo x)` will look more like
  133 `case foo x of y { __DEFAULT -> MkT y }`).
  134 
  135 Surprisingly, there are local binders with a strict demand that *do not*
  136 terminate quickly in a sense that is useful to us! The following function
  137 demonstrates that:
  138 ```
  139   j x = (let t = x+1 in t+t, 42)
  140 ```
  141 Here, `t` is used strictly, *but only within its scope in the first pair
  142 component*. `t` satisfies Note [CPR for binders that will be unboxed], so it has
  143 the CPR property, nevertheless we may not unbox `j` deeply lest evaluation of
  144 `x` diverges. The termination analysis must say "Might diverge" for `t` and we
  145 won't unbox the first pair component.
  146 There are a couple of tests in T18174 that show case Nested CPR. Some of them
  147 only work with the termination analysis from !1866.
  148 
  149 Giving the (Nested) CPR property to deep data structures can lead to loss of
  150 sharing; see Note [CPR for data structures can destroy sharing].
  151 
  152 Note [Phase ordering]
  153 ~~~~~~~~~~~~~~~~~~~~~
  154 We need to perform strictness analysis before CPR analysis, because that might
  155 unbox some arguments, in turn leading to more constructed products.
  156 Ideally, we would want the following pipeline:
  157 
  158 1. Strictness
  159 2. worker/wrapper (for strictness)
  160 3. CPR
  161 4. worker/wrapper (for CPR)
  162 
  163 Currently, we omit 2. and anticipate the results of worker/wrapper.
  164 See Note [CPR for binders that will be unboxed].
  165 An additional w/w pass would simplify things, but probably add slight overhead.
  166 So currently we have
  167 
  168 1. Strictness
  169 2. CPR
  170 3. worker/wrapper (for strictness and CPR)
  171 -}
  172 
  173 --
  174 -- * Analysing programs
  175 --
  176 
  177 cprAnalProgram :: Logger -> FamInstEnvs -> CoreProgram -> IO CoreProgram
  178 cprAnalProgram logger fam_envs binds = do
  179   let env            = emptyAnalEnv fam_envs
  180   let binds_plus_cpr = snd $ mapAccumL cprAnalTopBind env binds
  181   putDumpFileMaybe logger Opt_D_dump_cpr_signatures "Cpr signatures" FormatText $
  182     dumpIdInfoOfProgram (ppr . cprSigInfo) binds_plus_cpr
  183   -- See Note [Stamp out space leaks in demand analysis] in GHC.Core.Opt.DmdAnal
  184   seqBinds binds_plus_cpr `seq` return binds_plus_cpr
  185 
  186 -- Analyse a (group of) top-level binding(s)
  187 cprAnalTopBind :: AnalEnv
  188                -> CoreBind
  189                -> (AnalEnv, CoreBind)
  190 cprAnalTopBind env (NonRec id rhs)
  191   = (env', NonRec id' rhs')
  192   where
  193     (id', rhs', env') = cprAnalBind env id rhs
  194 
  195 cprAnalTopBind env (Rec pairs)
  196   = (env', Rec pairs')
  197   where
  198     (env', pairs') = cprFix env pairs
  199 
  200 --
  201 -- * Analysing expressions
  202 --
  203 
  204 -- | The abstract semantic function ⟦_⟧ : Expr -> Env -> A from
  205 -- "Constructed Product Result Analysis for Haskell"
  206 cprAnal, cprAnal'
  207   :: AnalEnv
  208   -> CoreExpr            -- ^ expression to be denoted by a 'CprType'
  209   -> (CprType, CoreExpr) -- ^ the updated expression and its 'CprType'
  210 
  211 cprAnal env e = -- pprTraceWith "cprAnal" (\res -> ppr (fst (res)) $$ ppr e) $
  212                   cprAnal' env e
  213 
  214 cprAnal' _ (Lit lit)     = (topCprType, Lit lit)
  215 cprAnal' _ (Type ty)     = (topCprType, Type ty)      -- Doesn't happen, in fact
  216 cprAnal' _ (Coercion co) = (topCprType, Coercion co)
  217 
  218 cprAnal' env (Cast e co)
  219   = (cpr_ty, Cast e' co)
  220   where
  221     (cpr_ty, e') = cprAnal env e
  222 
  223 cprAnal' env (Tick t e)
  224   = (cpr_ty, Tick t e')
  225   where
  226     (cpr_ty, e') = cprAnal env e
  227 
  228 cprAnal' env e@(Var{})
  229   = cprAnalApp env e []
  230 cprAnal' env e@(App{})
  231   = cprAnalApp env e []
  232 
  233 cprAnal' env (Lam var body)
  234   | isTyVar var
  235   , (body_ty, body') <- cprAnal env body
  236   = (body_ty, Lam var body')
  237   | otherwise
  238   = (lam_ty, Lam var body')
  239   where
  240     -- See Note [CPR for binders that will be unboxed]
  241     env'             = extendSigEnvForArg env var
  242     (body_ty, body') = cprAnal env' body
  243     lam_ty           = abstractCprTy body_ty
  244 
  245 cprAnal' env (Case scrut case_bndr ty alts)
  246   = (res_ty, Case scrut' case_bndr ty alts')
  247   where
  248     (scrut_ty, scrut') = cprAnal env scrut
  249     env'               = extendSigEnv env case_bndr (CprSig scrut_ty)
  250     (alt_tys, alts')   = mapAndUnzip (cprAnalAlt env' scrut_ty) alts
  251     res_ty             = foldl' lubCprType botCprType alt_tys
  252 
  253 cprAnal' env (Let (NonRec id rhs) body)
  254   = (body_ty, Let (NonRec id' rhs') body')
  255   where
  256     (id', rhs', env') = cprAnalBind env id rhs
  257     (body_ty, body')  = cprAnal env' body
  258 
  259 cprAnal' env (Let (Rec pairs) body)
  260   = body_ty `seq` (body_ty, Let (Rec pairs') body')
  261   where
  262     (env', pairs')   = cprFix env pairs
  263     (body_ty, body') = cprAnal env' body
  264 
  265 cprAnalAlt
  266   :: AnalEnv
  267   -> CprType -- ^ CPR type of the scrutinee
  268   -> Alt Var -- ^ current alternative
  269   -> (CprType, Alt Var)
  270 cprAnalAlt env scrut_ty (Alt con bndrs rhs)
  271   = (rhs_ty, Alt con bndrs rhs')
  272   where
  273     env_alt
  274       | DataAlt dc <- con
  275       , let ids = filter isId bndrs
  276       , CprType arity cpr <- scrut_ty
  277       , assert (arity == 0 ) True
  278       = case unpackConFieldsCpr dc cpr of
  279           AllFieldsSame field_cpr
  280             | let sig = mkCprSig 0 field_cpr
  281             -> extendSigEnvAllSame env ids sig
  282           ForeachField field_cprs
  283             | let sigs = zipWith (mkCprSig . idArity) ids field_cprs
  284             -> extendSigEnvList env (zipEqual "cprAnalAlt" ids sigs)
  285       | otherwise
  286       = env
  287     (rhs_ty, rhs') = cprAnal env_alt rhs
  288 
  289 --
  290 -- * CPR transformer
  291 --
  292 
  293 data TermFlag -- Better than using a Bool
  294   = Terminates
  295   | MightDiverge
  296 
  297 -- See Note [Nested CPR]
  298 exprTerminates :: CoreExpr -> TermFlag
  299 exprTerminates e
  300   | exprIsHNF e = Terminates -- A /very/ simple termination analysis.
  301   | otherwise   = MightDiverge
  302 
  303 cprAnalApp :: AnalEnv -> CoreExpr -> [(CprType, CoreArg)] -> (CprType, CoreExpr)
  304 -- Main function that takes care of /nested/ CPR. See Note [Nested CPR]
  305 cprAnalApp env e arg_infos = go e arg_infos []
  306   where
  307     go e arg_infos args'
  308       -- Collect CprTypes for (value) args (inlined collectArgs):
  309       | App fn arg <- e, isTypeArg arg -- Don't analyse Type args
  310       = go fn arg_infos (arg:args')
  311       | App fn arg <- e
  312       , arg_info@(_arg_ty, arg') <- cprAnal env arg
  313       -- See Note [Nested CPR] on the need for termination analysis
  314       = go fn (arg_info:arg_infos) (arg':args')
  315 
  316       | Var fn <- e
  317       = (cprTransform env fn arg_infos, mkApps e args')
  318 
  319       | (e_ty, e') <- cprAnal env e -- e is not an App and not a Var
  320       = (applyCprTy e_ty (length arg_infos), mkApps e' args')
  321 
  322 cprTransform :: AnalEnv               -- ^ The analysis environment
  323              -> Id                    -- ^ The function
  324              -> [(CprType, CoreArg)]  -- ^ info about incoming /value/ arguments
  325              -> CprType               -- ^ The demand type of the application
  326 cprTransform env id args
  327   -- Any local binding, except for data structure bindings
  328   -- See Note [Efficient Top sigs in SigEnv]
  329   | Just sig <- lookupSigEnv env id
  330   = applyCprTy (getCprSig sig) (length args)
  331   -- See Note [CPR for data structures]
  332   | Just rhs <- cprDataStructureUnfolding_maybe id
  333   = fst $ cprAnal env rhs
  334   -- Some (mostly global, known-key) Ids have bespoke CPR transformers
  335   | Just cpr_ty <- cprTransformBespoke id args
  336   = cpr_ty
  337   -- Other local Ids that respond True to 'isDataStructure' but don't have an
  338   -- expandable unfolding, such as NOINLINE bindings. They all get a top sig
  339   | isLocalId id
  340   = assertPpr (isDataStructure id) (ppr id) topCprType
  341   -- See Note [CPR for DataCon wrappers]
  342   | isDataConWrapId id, let rhs = uf_tmpl (realIdUnfolding id)
  343   = fst $ cprAnalApp env rhs args
  344   -- DataCon worker
  345   | Just con <- isDataConWorkId_maybe id
  346   = cprTransformDataConWork (ae_fam_envs env) con args
  347   -- Imported function
  348   | otherwise
  349   = applyCprTy (getCprSig (idCprSig id)) (length args)
  350 
  351 -- | Precise, hand-written CPR transformers for select Ids
  352 cprTransformBespoke :: Id -> [(CprType, CoreArg)] -> Maybe CprType
  353 cprTransformBespoke id args
  354   -- See Note [Simplification of runRW#] in GHC.CoreToStg.Prep
  355   | idUnique id == runRWKey    -- `runRW (\s -> e)`
  356   , [(arg_ty, _arg)] <- args   -- `\s -> e` has CPR type `arg` (e.g. `. -> 2`)
  357   = Just $ applyCprTy arg_ty 1 -- `e` has CPR type `2`
  358   | otherwise
  359   = Nothing
  360 
  361 -- | Get a (possibly nested) 'CprType' for an application of a 'DataCon' worker,
  362 -- given a saturated number of 'CprType's for its field expressions.
  363 -- Implements the Nested part of Note [Nested CPR].
  364 cprTransformDataConWork :: FamInstEnvs -> DataCon -> [(CprType, CoreArg)] -> CprType
  365 cprTransformDataConWork fam_envs con args
  366   | null (dataConExTyCoVars con)  -- No existentials
  367   , wkr_arity <= mAX_CPR_SIZE -- See Note [Trimming to mAX_CPR_SIZE]
  368   , args `lengthIs` wkr_arity
  369   , isRecDataCon fam_envs fuel con /= DefinitelyRecursive -- See Note [CPR for recursive data constructors]
  370   -- , pprTrace "cprTransformDataConWork" (ppr con <+> ppr wkr_arity <+> ppr args) True
  371   = CprType 0 (ConCpr (dataConTag con) (strictZipWith extract_nested_cpr args wkr_str_marks))
  372   | otherwise
  373   = topCprType
  374   where
  375     fuel = 3 -- If we can unbox more than 3 constructors to find a
  376              -- recursive occurrence, then we can just as well unbox it
  377              -- See Note [CPR for recursive data constructors], point (4)
  378     wkr_arity = dataConRepArity con
  379     wkr_str_marks = dataConRepStrictness con
  380     -- See Note [Nested CPR]
  381     extract_nested_cpr (CprType 0 cpr, arg) str
  382       | MarkedStrict <- str              = cpr
  383       | Terminates <- exprTerminates arg = cpr
  384     extract_nested_cpr _ _               = topCpr -- intervening lambda or doesn't terminate
  385 
  386 -- | See Note [Trimming to mAX_CPR_SIZE].
  387 mAX_CPR_SIZE :: Arity
  388 mAX_CPR_SIZE = 10
  389 
  390 --
  391 -- * Bindings
  392 --
  393 
  394 -- Recursive bindings
  395 cprFix :: AnalEnv                    -- Does not include bindings for this binding
  396        -> [(Id,CoreExpr)]
  397        -> (AnalEnv, [(Id,CoreExpr)]) -- Binders annotated with CPR info
  398 cprFix orig_env orig_pairs
  399   = loop 1 init_env init_pairs
  400   where
  401     init_sig id
  402       -- See Note [CPR for data structures]
  403       | isDataStructure id = topCprSig
  404       | otherwise          = mkCprSig 0 botCpr
  405     -- See Note [Initialising strictness] in GHC.Core.Opt.DmdAnal
  406     orig_virgin = ae_virgin orig_env
  407     init_pairs | orig_virgin  = [(setIdCprSig id (init_sig id), rhs) | (id, rhs) <- orig_pairs ]
  408                | otherwise    = orig_pairs
  409     init_env = extendSigEnvFromIds orig_env (map fst init_pairs)
  410 
  411     -- The fixed-point varies the idCprSig field of the binders and and their
  412     -- entries in the AnalEnv, and terminates if that annotation does not change
  413     -- any more.
  414     loop :: Int -> AnalEnv -> [(Id,CoreExpr)] -> (AnalEnv, [(Id,CoreExpr)])
  415     loop n env pairs
  416       | found_fixpoint = (reset_env', pairs')
  417       | otherwise      = loop (n+1) env' pairs'
  418       where
  419         -- In all but the first iteration, delete the virgin flag
  420         -- See Note [Initialising strictness] in GHC.Core.Opt.DmdAnal
  421         (env', pairs') = step (applyWhen (n/=1) nonVirgin env) pairs
  422         -- Make sure we reset the virgin flag to what it was when we are stable
  423         reset_env'     = env'{ ae_virgin = orig_virgin }
  424         found_fixpoint = map (idCprSig . fst) pairs' == map (idCprSig . fst) pairs
  425 
  426     step :: AnalEnv -> [(Id, CoreExpr)] -> (AnalEnv, [(Id, CoreExpr)])
  427     step env pairs = mapAccumL go env pairs
  428       where
  429         go env (id, rhs) = (env', (id', rhs'))
  430           where
  431             (id', rhs', env') = cprAnalBind env id rhs
  432 
  433 -- | Process the RHS of the binding for a sensible arity, add the CPR signature
  434 -- to the Id, and augment the environment with the signature as well.
  435 cprAnalBind
  436   :: AnalEnv
  437   -> Id
  438   -> CoreExpr
  439   -> (Id, CoreExpr, AnalEnv)
  440 cprAnalBind env id rhs
  441   | isDFunId id -- Never give DFuns the CPR property; we'll never save allocs.
  442   = (id,  rhs,  extendSigEnv env id topCprSig)
  443   -- See Note [CPR for data structures]
  444   | isDataStructure id
  445   = (id,  rhs,  env) -- Data structure => no code => no need to analyse rhs
  446   | otherwise
  447   = (id', rhs', env')
  448   where
  449     (rhs_ty, rhs')  = cprAnal env rhs
  450     -- possibly trim thunk CPR info
  451     rhs_ty'
  452       -- See Note [CPR for thunks]
  453       | stays_thunk = trimCprTy rhs_ty
  454       | otherwise   = rhs_ty
  455     -- See Note [Arity trimming for CPR signatures]
  456     sig  = mkCprSigForArity (idArity id) rhs_ty'
  457     id'  = setIdCprSig id sig
  458     env' = extendSigEnv env id sig
  459 
  460     -- See Note [CPR for thunks]
  461     stays_thunk = is_thunk && not_strict
  462     is_thunk    = not (exprIsHNF rhs) && not (isJoinId id)
  463     not_strict  = not (isStrUsedDmd (idDemandInfo id))
  464 
  465 isDataStructure :: Id -> Bool
  466 -- See Note [CPR for data structures]
  467 isDataStructure id =
  468   not (isJoinId id) && idArity id == 0 && isEvaldUnfolding (idUnfolding id)
  469 
  470 -- | Returns an expandable unfolding
  471 -- (See Note [exprIsExpandable] in "GHC.Core.Utils") that has
  472 -- So effectively is a constructor application.
  473 cprDataStructureUnfolding_maybe :: Id -> Maybe CoreExpr
  474 cprDataStructureUnfolding_maybe id
  475   -- There are only FinalPhase Simplifier runs after CPR analysis
  476   | activeInFinalPhase (idInlineActivation id)
  477   , isDataStructure id
  478   = expandUnfolding_maybe (idUnfolding id)
  479   | otherwise
  480   = Nothing
  481 
  482 {- Note [Arity trimming for CPR signatures]
  483 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  484 Although it doesn't affect correctness of the analysis per se, we have to trim
  485 CPR signatures to idArity. Here's what might happen if we don't:
  486 
  487   f x = if expensive
  488           then \y. Box y
  489           else \z. Box z
  490   g a b = f a b
  491 
  492 The two lambdas will have a CPR type of @1m@ (so construct a product after
  493 applied to one argument). Thus, @f@ will have a CPR signature of @2m@
  494 (constructs a product after applied to two arguments).
  495 But WW will never eta-expand @f@! In this case that would amount to possibly
  496 duplicating @expensive@ work.
  497 
  498 (Side note: Even if @f@'s 'idArity' happened to be 2, it would not do so, see
  499 Note [Don't eta expand in w/w].)
  500 
  501 So @f@ will not be worker/wrappered. But @g@ also inherited its CPR signature
  502 from @f@'s, so it *will* be WW'd:
  503 
  504   f x = if expensive
  505           then \y. Box y
  506           else \z. Box z
  507   $wg a b = case f a b of Box x -> x
  508   g a b = Box ($wg a b)
  509 
  510 And the case in @g@ can never cancel away, thus we introduced extra reboxing.
  511 Hence we always trim the CPR signature of a binding to idArity.
  512 
  513 Note [CPR for DataCon wrappers]
  514 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  515 We used to give DataCon wrappers a (necessarily flat) CPR signature in
  516 'GHC.Types.Id.Make.mkDataConRep'. Now we transform DataCon wrappers simply by
  517 analysing their unfolding. A few reasons for the change:
  518 
  519   1. DataCon wrappers are generally inlined in the Final phase (so before CPR),
  520      all leftover occurrences are in a boring context like `f x y = $WMkT y x`.
  521      It's simpler to analyse the unfolding anew at every such call site, and the
  522      unfolding will be pretty cheap to analyse. Also they occur seldom enough
  523      that performance-wise it doesn't matter.
  524   2. 'GHC.Types.Id.Make' no longer precomputes CPR signatures for DataCon
  525      *workers*, because their transformers need to adapt to CPR for their
  526      arguments in 'cprTransformDataConWork' to enable Note [Nested CPR].
  527      Better keep it all in this module! The alternative would be that
  528      'GHC.Types.Id.Make' depends on DmdAnal.
  529   3. In the future, Nested CPR could take a better account of incoming args
  530      in cprAnalApp and do some beta-reduction on the fly, like !1866 did. If
  531      any of those args had the CPR property, then we'd even get Nested CPR for
  532      DataCon wrapper calls, for free. Not so if we simply give the wrapper a
  533      single CPR sig in 'GHC.Types.Id.Make.mkDataConRep'!
  534 
  535 Note [Trimming to mAX_CPR_SIZE]
  536 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  537 We do not treat very big tuples as CPR-ish:
  538 
  539   a) For a start, we get into trouble because there aren't
  540      "enough" unboxed tuple types (a tiresome restriction,
  541      but hard to fix),
  542   b) More importantly, big unboxed tuples get returned mainly
  543      on the stack, and are often then allocated in the heap
  544      by the caller. So doing CPR for them may in fact make
  545      things worse, especially if the wrapper doesn't cancel away
  546      and we move to the stack in the worker and then to the heap
  547      in the wrapper.
  548 
  549 So we (nested) CPR for functions that would otherwise pass more than than
  550 'mAX_CPR_SIZE' fields.
  551 That effect is exacerbated for the unregisterised backend, where we
  552 don't have any hardware registers to return the fields in. Returning
  553 everything on the stack results in much churn and increases compiler
  554 allocation by 15% for T15164 in a validate build.
  555 -}
  556 
  557 data AnalEnv
  558   = AE
  559   { ae_sigs   :: SigEnv
  560   -- ^ Current approximation of signatures for local ids
  561   , ae_virgin :: Bool
  562   -- ^ True only on every first iteration in a fixed-point
  563   -- iteration. See Note [Initialising strictness] in "GHC.Core.Opt.DmdAnal"
  564   , ae_fam_envs :: FamInstEnvs
  565   -- ^ Needed when expanding type families and synonyms of product types.
  566   }
  567 
  568 instance Outputable AnalEnv where
  569   ppr (AE { ae_sigs = env, ae_virgin = virgin })
  570     = text "AE" <+> braces (vcat
  571          [ text "ae_virgin =" <+> ppr virgin
  572          , text "ae_sigs =" <+> ppr env ])
  573 
  574 -- | An environment storing 'CprSig's for local Ids.
  575 -- Puts binders with 'topCprSig' in a space-saving 'IntSet'.
  576 -- See Note [Efficient Top sigs in SigEnv].
  577 data SigEnv
  578   = SE
  579   { se_tops :: !UnVarSet
  580   -- ^ All these Ids have 'topCprSig'. Like a 'VarSet', but more efficient.
  581   , se_sigs :: !(VarEnv CprSig)
  582   -- ^ Ids that have something other than 'topCprSig'.
  583   }
  584 
  585 instance Outputable SigEnv where
  586   ppr (SE { se_tops = tops, se_sigs = sigs })
  587     = text "SE" <+> braces (vcat
  588          [ text "se_tops =" <+> ppr tops
  589          , text "se_sigs =" <+> ppr sigs ])
  590 
  591 emptyAnalEnv :: FamInstEnvs -> AnalEnv
  592 emptyAnalEnv fam_envs
  593   = AE
  594   { ae_sigs = SE emptyUnVarSet emptyVarEnv
  595   , ae_virgin = True
  596   , ae_fam_envs = fam_envs
  597   }
  598 
  599 modifySigEnv :: (SigEnv -> SigEnv) -> AnalEnv -> AnalEnv
  600 modifySigEnv f env = env { ae_sigs = f (ae_sigs env) }
  601 
  602 lookupSigEnv :: AnalEnv -> Id -> Maybe CprSig
  603 -- See Note [Efficient Top sigs in SigEnv]
  604 lookupSigEnv AE{ae_sigs = SE tops sigs} id
  605   | id `elemUnVarSet` tops = Just topCprSig
  606   | otherwise              = lookupVarEnv sigs id
  607 
  608 extendSigEnv :: AnalEnv -> Id -> CprSig -> AnalEnv
  609 -- See Note [Efficient Top sigs in SigEnv]
  610 extendSigEnv env id sig
  611   | isTopCprSig sig
  612   = modifySigEnv (\se -> se{se_tops = extendUnVarSet id (se_tops se)}) env
  613   | otherwise
  614   = modifySigEnv (\se -> se{se_sigs = extendVarEnv (se_sigs se) id sig}) env
  615 
  616 -- | Extend an environment with the (Id, CPR sig) pairs
  617 extendSigEnvList :: AnalEnv -> [(Id, CprSig)] -> AnalEnv
  618 extendSigEnvList env ids_cprs
  619   = foldl' (\env (id, sig) -> extendSigEnv env id sig) env ids_cprs
  620 
  621 -- | Extend an environment with the CPR sigs attached to the ids
  622 extendSigEnvFromIds :: AnalEnv -> [Id] -> AnalEnv
  623 extendSigEnvFromIds env ids
  624   = foldl' (\env id -> extendSigEnv env id (idCprSig id)) env ids
  625 
  626 -- | Extend an environment with the same CPR sig for all ids
  627 extendSigEnvAllSame :: AnalEnv -> [Id] -> CprSig -> AnalEnv
  628 extendSigEnvAllSame env ids sig
  629   = foldl' (\env id -> extendSigEnv env id sig) env ids
  630 
  631 nonVirgin :: AnalEnv -> AnalEnv
  632 nonVirgin env = env { ae_virgin = False }
  633 
  634 -- | A version of 'extendSigEnv' for a binder of which we don't see the RHS
  635 -- needed to compute a 'CprSig' (e.g. lambdas and DataAlt field binders).
  636 -- In this case, we can still look at their demand to attach CPR signatures
  637 -- anticipating the unboxing done by worker/wrapper.
  638 -- See Note [CPR for binders that will be unboxed].
  639 extendSigEnvForArg :: AnalEnv -> Id -> AnalEnv
  640 extendSigEnvForArg env id
  641   = extendSigEnv env id (CprSig (argCprType (idDemandInfo id)))
  642 
  643 -- | Produces a 'CprType' according to how a strict argument will be unboxed.
  644 -- Examples:
  645 --
  646 --   * A head-strict demand @1!L@ would translate to @1@
  647 --   * A product demand @1!P(1!L,L)@ would translate to @1(1,)@
  648 --   * A product demand @1!P(1L,L)@ would translate to @1(,)@,
  649 --     because the first field will not be unboxed.
  650 argCprType :: Demand -> CprType
  651 argCprType dmd = CprType 0 (go dmd)
  652   where
  653     go (n :* sd)
  654       | isAbs n               = topCpr
  655       | Prod Unboxed ds <- sd = ConCpr fIRST_TAG (strictMap go ds)
  656       | Poly Unboxed _  <- sd = ConCpr fIRST_TAG []
  657       | otherwise             = topCpr
  658 
  659 {- Note [Safe abortion in the fixed-point iteration]
  660 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  661 Fixed-point iteration may fail to terminate. But we cannot simply give up and
  662 return the environment and code unchanged! We still need to do one additional
  663 round, to ensure that all expressions have been traversed at least once, and any
  664 unsound CPR annotations have been updated.
  665 
  666 Note [Efficient Top sigs in SigEnv]
  667 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  668 It's pretty common for binders in the SigEnv to have a 'topCprSig'.
  669 Wide records with 100 fields like in T9675 even will generate code where the
  670 majority of binders has Top signature. To save some allocations, we store
  671 those binders with a Top signature in a separate UnVarSet (which is an IntSet
  672 with a convenient Var-tailored API).
  673 
  674 Why store top signatures at all in the SigEnv? After all, when 'cprTransform'
  675 encounters a locally-bound Id without an entry in the SigEnv, it should behave
  676 as if that binder has a Top signature!
  677 Well, the problem is when case binders should have a Top signatures. They always
  678 have an unfolding and thus look to 'cprTransform' as if they bind a data
  679 structure, Note [CPR for data structures], and thus would always have the CPR
  680 property. So we need some mechanism to separate data structures from case
  681 binders with a Top signature, and the UnVarSet provides that in the least
  682 convoluted way I can think of.
  683 
  684 Note [CPR for binders that will be unboxed]
  685 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  686 If a lambda-bound variable will be unboxed by worker/wrapper (so it must be
  687 demanded strictly), then give it a CPR signature. Here's a concrete example
  688 ('f1' in test T10482a), assuming h is strict:
  689 
  690   f1 :: Int -> Int
  691   f1 x = case h x of
  692           A -> x
  693           B -> f1 (x-1)
  694           C -> x+1
  695 
  696 If we notice that 'x' is used strictly, we can give it the CPR
  697 property; and hence f1 gets the CPR property too.  It's sound (doesn't
  698 change strictness) to give it the CPR property because by the time 'x'
  699 is returned (case A above), it'll have been evaluated (by the wrapper
  700 of 'h' in the example).
  701 
  702 Moreover, if f itself is strict in x, then we'll pass x unboxed to
  703 f1, and so the boxed version *won't* be available; in that case it's
  704 very helpful to give 'x' the CPR property.
  705 
  706 This is all done in 'extendSigEnvForArg'.
  707 
  708 Note that
  709 
  710   * Whether or not something unboxes is decided by 'wantToUnboxArg', else we may
  711     get over-optimistic CPR results (e.g., from \(x :: a) -> x!).
  712 
  713   * If the demand unboxes deeply, we can give the binder a /nested/ CPR
  714     property, e.g.
  715 
  716       g :: (Int, Int) -> Int
  717       g p = case p of
  718         (x, y) | x < 0     -> 0
  719                | otherwise -> x
  720 
  721     `x` should have the CPR property because it will be unboxed. We do so
  722     by giving `p` the Nested CPR property `1(1,)`, indicating that we not only
  723     have `p` available unboxed, but also its field `x`. Analysis of the Case
  724     will then transfer the CPR property to `x`.
  725 
  726     Before we were able to express Nested CPR, we used to guess which field
  727     binders should get the CPR property.
  728     See Historic Note [Optimistic field binder CPR].
  729 
  730   * See Note [CPR examples]
  731 
  732 Note [CPR for thunks]
  733 ~~~~~~~~~~~~~~~~~~~~~
  734 If the rhs is a thunk, we usually forget the CPR info, because
  735 it is presumably shared (else it would have been inlined, and
  736 so we'd lose sharing if w/w'd it into a function).  E.g.
  737 
  738         let r = case expensive of
  739                   (a,b) -> (b,a)
  740         in ...
  741 
  742 If we marked r as having the CPR property, then we'd w/w into
  743 
  744         let $wr = \() -> case expensive of
  745                             (a,b) -> (# b, a #)
  746             r = case $wr () of
  747                   (# b,a #) -> (b,a)
  748         in ...
  749 
  750 But now r is a thunk, which won't be inlined, so we are no further ahead.
  751 But consider
  752 
  753         f x = let r = case expensive of (a,b) -> (b,a)
  754               in if foo r then r else (x,x)
  755 
  756 Does f have the CPR property?  Well, no.
  757 
  758 However, if the strictness analyser has figured out (in a previous
  759 iteration) that it's strict, then we DON'T need to forget the CPR info.
  760 Instead we can retain the CPR info and do the thunk-splitting transform
  761 (see WorkWrap.splitThunk).
  762 
  763 This made a big difference to PrelBase.modInt, which had something like
  764         modInt = \ x -> let r = ... -> I# v in
  765                         ...body strict in r...
  766 r's RHS isn't a value yet; but modInt returns r in various branches, so
  767 if r doesn't have the CPR property then neither does modInt
  768 Another case I found in practice (in Complex.magnitude), looks like this:
  769                 let k = if ... then I# a else I# b
  770                 in ... body strict in k ....
  771 (For this example, it doesn't matter whether k is returned as part of
  772 the overall result; but it does matter that k's RHS has the CPR property.)
  773 Left to itself, the simplifier will make a join point thus:
  774                 let $j k = ...body strict in k...
  775                 if ... then $j (I# a) else $j (I# b)
  776 With thunk-splitting, we get instead
  777                 let $j x = let k = I#x in ...body strict in k...
  778                 in if ... then $j a else $j b
  779 This is much better; there's a good chance the I# won't get allocated.
  780 
  781 But what about botCpr? Consider
  782     lvl = error "boom"
  783     fac -1 = lvl
  784     fac 0 = 1
  785     fac n = n * fac (n-1)
  786 fac won't have the CPR property here when we trim every thunk! But the
  787 assumption is that error cases are rarely entered and we are diverging anyway,
  788 so WW doesn't hurt.
  789 
  790 Should we also trim CPR on DataCon application bindings?
  791 See Note [CPR for data structures]!
  792 
  793 Note [CPR for data structures]
  794 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  795 Long static data structures (whether top-level or not) like
  796 
  797   xs = x1 : xs1
  798   xs1 = x2 : xs2
  799   xs2 = x3 : xs3
  800 
  801 should not get (nested) CPR signatures (#18154), because they
  802 
  803   * Never get WW'd, so their CPR signature should be irrelevant after analysis
  804     (in fact the signature might even be harmful for that reason)
  805   * Would need to be inlined/expanded to see their constructed product
  806   * BUT MOST IMPORTANTLY, Problem P1:
  807     Recording CPR on them blows up interface file sizes and is redundant with
  808     their unfolding. In case of Nested CPR, this blow-up can be quadratic!
  809     Reason: the CPR info for xs1 contains the CPR info for xs; the CPR info
  810     for xs2 contains that for xs1. And so on.
  811     By contrast, the size of unfoldings and types stays linear. That's why
  812     quadratic blowup is problematic; it makes an asymptotic difference.
  813 
  814 Hence (Solution S1) we don't give data structure bindings a CPR *signature* and
  815 hence don't to analyse them in 'cprAnalBind'.
  816 What do we mean by "data structure binding"? Answer:
  817 
  818   (1) idArity id == 0    (otherwise it's a function)
  819   (2) is eval'd          (otherwise it's a thunk, Note [CPR for thunks] applies)
  820   (3) not (isJoinId id)  (otherwise it's a function and its more efficient to
  821                           analyse it just once rather than at each call site)
  822 
  823 But (S1) leads to a new Problem P2: We can't just stop giving DataCon application
  824 bindings the CPR *property*, for example the factorial function after FloatOut
  825 
  826   lvl = I# 1#
  827   fac 0 = lvl
  828   fac n = n * fac (n-1)
  829 
  830 lvl is a data structure, and hence (see above) will not have a CPR *signature*.
  831 But if lvl doesn't have the CPR *property*, fac won't either and we allocate a
  832 box for the result on every iteration of the loop.
  833 
  834 So (Solution S2) when 'cprAnal' meets a variable lacking a CPR signature to
  835 extrapolate into a CPR transformer, 'cprTransform' tries to get its unfolding
  836 (via 'cprDataStructureUnfolding_maybe'), and analyses that instead.
  837 
  838 The Result R1: Everything behaves as if there was a CPR signature, but without
  839 the blowup in interface files.
  840 
  841 There is one exception to (R1):
  842 
  843   x   = (y, z); {-# NOINLINE x #-}
  844   f p = (y, z); {-# NOINLINE f #-}
  845 
  846 While we still give the NOINLINE *function* 'f' the CPR property (and WW
  847 accordingly, see Note [Worker/wrapper for NOINLINE functions]), we won't
  848 give the NOINLINE *data structure* 'x' the CPR property, because it lacks an
  849 unfolding. In particular, KindRep bindings are NOINLINE data structures (see
  850 the noinline wrinkle in Note [Grand plan for Typeable]). We'll behave as if the
  851 bindings had 'topCprSig', and that is fine, as a case on the binding would never
  852 cancel away after WW!
  853 
  854 It's also worth pointing out how ad-hoc (S1) is: If we instead had
  855 
  856     f1 x = x:[]
  857     f2 x = x : f1 x
  858     f3 x = x : f2 x
  859     ...
  860 
  861 we still give every function an ever deepening CPR signature. But it's very
  862 uncommon to find code like this, whereas the long static data structures from
  863 the beginning of this Note are very common because of GHC's strategy of ANF'ing
  864 data structure RHSs.
  865 
  866 Note [CPR for data structures can destroy sharing]
  867 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  868 In Note [CPR for data structures], we argued that giving data structure bindings
  869 the CPR property is useful to give functions like fac the CPR property:
  870 
  871   lvl = I# 1#
  872   fac 0 = lvl
  873   fac n = n * fac (n-1)
  874 
  875 Worker/wrappering fac for its CPR property means we get a very fast worker
  876 function with type Int# -> Int#, without any heap allocation at all.
  877 
  878 But consider what happens if we call `map fac (replicate n 0)`, where the
  879 wrapper doesn't cancel away: Then we rebox the result of $wfac *on each call*,
  880 n times, instead of reusing the static thunk for 1, e.g. an asymptotic increase
  881 in allocations. If you twist it just right, you can actually write programs that
  882 that take O(n) space if you do CPR and O(1) if you don't:
  883 
  884   fac :: Int -> Int
  885   fac 0 = 1 -- this clause will trigger CPR and destroy sharing for O(n) space
  886   -- fac 0 = lazy 1 -- this clause will prevent CPR and run in O(1) space
  887   fac n = n * fac (n-1)
  888 
  889   const0 :: Int -> Int
  890   const0 n = signum n - 1 -- will return 0 for [1..n]
  891   {-# NOINLINE const0 #-}
  892 
  893   main = print $ foldl' (\acc n -> acc + lazy n) 0 $ map (fac . const0) [1..100000000]
  894 
  895 Generally, this kind of asymptotic increase in allocation can happen whenever we
  896 give a data structure the CPR property that is bound outside of a recursive
  897 function. So far we don't have a convincing remedy; giving fac the CPR property
  898 is just too attractive. #19309 documents a futile idea. #13331 tracks the
  899 general issue of WW destroying sharing and also contains above reproducer.
  900 #19326 is about CPR destroying sharing in particular.
  901 
  902 With Nested CPR, sharing can also be lost within the same "lambda level", for
  903 example:
  904 
  905   f (I# x) = let y = I# (x*#x) in (y, y)
  906 
  907 Nestedly unboxing would destroy the box shared through 'y'. (Perhaps we can call
  908 this "internal sharing", in contrast to "external sharing" beyond lambda or even
  909 loop levels above.) But duplicate occurrences like that are pretty rare and may
  910 never lead to an asymptotic difference in allocations of 'f'.
  911 
  912 Note [CPR for recursive data constructors]
  913 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  914 Note [CPR for data structures can destroy sharing] gives good reasons not to
  915 give shared data structure bindings the CPR property. But we shouldn't even
  916 give *functions* that return *recursive* data constructor applications the CPR
  917 property. Here's an example for why:
  918 
  919   c = C# 'a'
  920   replicateC :: Int -> [Int]
  921   replicateC 1 = [c]
  922   replicateC n = c : replicateC (n-1)
  923 
  924 What happens if we give `replicateC` the (nested) CPR property? We get a WW
  925 split for 'replicateC', the wrapper of which is certain to inline, like this:
  926 
  927   replicateC (I# n) = case $wreplicateC n of (# x, xs #) -> C# x : xs
  928   $wreplicateC 1# = (# 'a', [] #)
  929   $wreplicateC n  = (# 'a', replicateC (I# (n -# 1#)) #)
  930 
  931 Eliminating the shared 'c' binding in the process. And then
  932 
  933   * We *might* save allocation of the topmost (of most likely several) (:)
  934     constructor if it cancels away at the call site. Similarly for the 'C#'
  935     constructor.
  936   * But we will now re-allocate the C# box on every iteration of the loop,
  937     because we separated the character literal from the C# application.
  938     That means n times as many C# allocations as before. Yikes!!
  939   * We make all other call sites where the wrapper inlines a bit larger, most of
  940     them for no gain. But this shouldn't matter much.
  941   * The inlined wrapper may inhibit eta-expansion in some cases. Here's how:
  942     If the wrapper is inlined in a strict arg position, the Simplifier will
  943     transform as follows
  944 
  945       f (replicateC n)
  946       ==> { inline }
  947       f (case $wreplicateC n of (# x, xs #) -> (C# x, xs))
  948       ==> { strict arg }
  949       case $wreplicateC n of (# x, xs #) -> f (C# x, xs)
  950 
  951     Now we can't float out the case anymore. In fact, we can't even float out
  952     `$wreplicateC n`, because it returns an unboxed tuple.
  953     This can inhibit eta-expansion if we later find out that `f` has arity > 1
  954     (such as when we define `foldl` in terms of `foldr`). #19970 shows how
  955     abstaining from worker/wrappering made a difference of -20% in reptile. So
  956     while WW'ing for CPR didn't make the program slower directly, the resulting
  957     program got much harder to optimise because of the returned unboxed tuple
  958     (which can't easily float because unlifted).
  959 
  960 `replicateC` comes up in T5536, which regresses significantly if CPR'd nestedly.
  961 
  962 What can we do about it?
  963 
  964  A. Don't CPR functions that return a *recursive data type* (the list in this
  965     case). This is the solution we adopt. Rationale: the benefit of CPR on
  966     recursive data structures is slight, because it only affects the outer layer
  967     of a potentially massive data structure.
  968  B. Don't CPR any *recursive function*. That would be quite conservative, as it
  969     would also affect e.g. the factorial function.
  970  C. Flat CPR only for recursive functions. This prevents the asymptotic
  971     worsening part arising through unsharing the C# box, but it's still quite
  972     conservative.
  973  D. No CPR at occurrences of shared data structure in hot paths (e.g. the use of
  974     `c` in the second eqn of `replicateC`). But we'd need to know which paths
  975     were hot. We want such static branch frequency estimates in #20378.
  976 
  977 We adopt solution (A) It is ad-hoc, but appears to work reasonably well.
  978 Deciding what a "recursive data constructor" is is quite tricky and ad-hoc, too:
  979 See Note [Detecting recursive data constructors]. We don't have to be perfect
  980 and can simply keep on unboxing if unsure.
  981 
  982 Note [Detecting recursive data constructors]
  983 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  984 What qualifies as a "recursive data constructor" as per
  985 Note [CPR for recursive data constructors]? That is up to
  986 'GHC.Core.Opt.WorkWrapW.Utils.isRecDataCon' to decide. It does a DFS search over
  987 the field types of the DataCon and looks for term-level recursion into the data
  988 constructor's type constructor. A few perhaps surprising points:
  989 
  990   1. It deems any function type as non-recursive, because it's unlikely that
  991      a recursion through a function type builds up a recursive data structure.
  992   2. It doesn't look into kinds or coercion types because there's nothing to unbox.
  993      Same for promoted data constructors.
  994   3. We don't care whether a NewTyCon or DataTyCon App is fully saturated or not;
  995      we simply look at its definition/DataCons and its field tys. Any recursive arg
  996      occs will have been detected before (see the invariant of 'go_tc_app').
  997      This is so that we expand the `ST` in `StateT Int (ST s) a`.
  998   4. We don't recurse deeper than 3 (at the moment of this writing) TyCons and
  999      assume the DataCon is non-recursive after that. One reason is guaranteed
 1000      constant-time efficiency; the other is that it's fair to say that a recursion
 1001      over 3 or more TyCons doesn't really count as a list-like data structure
 1002      anymore and a bit of unboxing doesn't hurt much.
 1003   5. It checks TyConApps like `T <huge> <type>` by eagerly checking the
 1004      potentially huge argument types *before* it tries to expand the
 1005      DataCons/NewTyCon/TyFams/etc. so that it doesn't need to re-check those
 1006      argument types after having been substituted into every occurrence of
 1007      the the respective TyCon parameter binders. It's like call-by-value vs.
 1008      call-by-name: Eager checking of argument types means we only need to check
 1009      them exactly once.
 1010      There's one exception to that rule, namely when we are able to reduce a
 1011      TyFam by considering argument types. Then we pay the price of potentially
 1012      checking the same type arg twice (or more, if the TyFam is recursive).
 1013      It should hardly matter.
 1014   6. As a result of keeping the implementation simple, it says "recursive"
 1015      for `data T = MkT [T]`, even though we could argue that the inner recursion
 1016      (through the `[]` TyCon) by way of which `T` is recursive will already be
 1017      "broken" and thus never unboxed. Consequently, it might be OK to CPR a
 1018      function returning `T`. Lacking arguments for or against the current simple
 1019      behavior, we stick to it.
 1020   7. When the search hits an abstract TyCon (one without visible DataCons, e.g.,
 1021      from an .hs-boot file), it returns 'Nothing' for "inconclusive", the same
 1022      as when we run out of fuel. If there is ever a recursion through an
 1023      abstract TyCon, then it's not part of the same function we are looking at,
 1024      so we can treat it as if it wasn't recursive.
 1025 
 1026 Here are a few examples of data constructors or data types with a single data
 1027 con and the answers of our function:
 1028 
 1029   data T = T (Int, (Bool, Char))               NonRec
 1030   (:)                                          Rec
 1031   []                                           NonRec
 1032   data U = U [Int]                             NonRec
 1033   data U2 = U2 [U2]                            Rec     (see point (6))
 1034   data T1 = T1 T2; data T2 = T2 T1             Rec
 1035   newtype Fix f = Fix (f (Fix f))              Rec
 1036   data N = N (Fix (Either Int))                NonRec
 1037   data M = M (Fix (Either M))                  Rec
 1038   data F = F (F -> Int)                        NonRec  (see point (1))
 1039   data G = G (Int -> G)                        NonRec  (see point (1))
 1040   newtype MyM s a = MyM (StateT Int (ST s) a   NonRec
 1041   type S = (Int, Bool)                         NonRec
 1042 
 1043   { type family E a where
 1044       E Int = Char
 1045       E (a,b) = (E a, E b)
 1046       E Char = Blub
 1047     data Blah = Blah (E (Int, (Int, Int)))     NonRec  (see point (5))
 1048     data Blub = Blub (E (Char, Int))           Rec
 1049     data Blub2 = Blub2 (E (Bool, Int))     }   Rec, because stuck
 1050 
 1051   { data T1 = T1 T2; data T2 = T2 T3;
 1052     ... data T5 = T5 T1                    }   Nothing (out of fuel)  (see point (4))
 1053 
 1054   { module A where -- A.hs-boot
 1055       data T
 1056     module B where
 1057       import {-# SOURCE #-} A
 1058       data U = MkU T
 1059       f :: T -> U
 1060       f t = MkU t                              Nothing (T is abstract)  (see point (7))
 1061     module A where -- A.hs
 1062       import B
 1063       data T = MkT U }
 1064 
 1065 These examples are tested by the testcase RecDataConCPR.
 1066 
 1067 I've played with the idea to make points (1) through (3) of 'isRecDataCon'
 1068 configurable like (4) to enable more re-use throughout the compiler, but haven't
 1069 found a killer app for that yet, so ultimately didn't do that.
 1070 
 1071 Note [CPR examples]
 1072 ~~~~~~~~~~~~~~~~~~~
 1073 Here are some examples (stranal/should_compile/T10482a) of the
 1074 usefulness of Note [Optimistic field binder CPR].  The main
 1075 point: all of these functions can have the CPR property.
 1076 
 1077     ------- f1 -----------
 1078     -- x is used strictly by h, so it'll be available
 1079     -- unboxed before it is returned in the True branch
 1080 
 1081     f1 :: Int -> Int
 1082     f1 x = case h x x of
 1083             True  -> x
 1084             False -> f1 (x-1)
 1085 
 1086     ------- f3 -----------
 1087     -- h is strict in x, so x will be unboxed before it
 1088     -- is rerturned in the otherwise case.
 1089 
 1090     data T3 = MkT3 Int Int
 1091 
 1092     f1 :: T3 -> Int
 1093     f1 (MkT3 x y) | h x y     = f3 (MkT3 x (y-1))
 1094                   | otherwise = x
 1095 
 1096 Historic Note [Optimistic field binder CPR]
 1097 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1098 This Note describes how we used to guess whether fields have the CPR property
 1099 before we were able to express Nested CPR for arguments.
 1100 
 1101 Consider
 1102 
 1103   data T a = MkT a
 1104   f :: T Int -> Int
 1105   f x = ... (case x of
 1106     MkT y -> y) ...
 1107 
 1108 And assume we know from strictness analysis that `f` is strict in `x` and its
 1109 field `y` and we unbox both. Then we give `x` the CPR property according
 1110 to Note [CPR for binders that will be unboxed]. But `x`'s sole field `y`
 1111 likewise will be unboxed and it should also get the CPR property. We'd
 1112 need a *nested* CPR property here for `x` to express that and unwrap one level
 1113 when we analyse the Case to give the CPR property to `y`.
 1114 
 1115 Lacking Nested CPR (hence this Note is historic now that we have Nested CPR), we
 1116 have to guess a bit, by looking for
 1117 
 1118   (A) Flat CPR on the scrutinee
 1119   (B) A variable scrutinee. Otherwise surely it can't be a parameter.
 1120   (C) Strict demand on the field binder `y` (or it binds a strict field)
 1121 
 1122 While (A) is a necessary condition to give a field the CPR property, there are
 1123 ways in which (B) and (C) are too lax, leading to unsound analysis results and
 1124 thus reboxing in the wrapper:
 1125 
 1126   (b) We could scrutinise some other variable than a parameter, like in
 1127 
 1128         g :: T Int -> Int
 1129         g x = let z = foo x in -- assume `z` has CPR property
 1130               case z of MkT y -> y
 1131 
 1132       Lacking Nested CPR and multiple levels of unboxing, only the outer box
 1133       of `z` will be available and a case on `y` won't actually cancel away.
 1134       But it's simple, and nothing terrible happens if we get it wrong. e.g.
 1135       #10694.
 1136 
 1137   (c) A strictly used field binder doesn't mean the function is strict in it.
 1138 
 1139         h :: T Int -> Int -> Int
 1140         h !x 0 = 0
 1141         h  x 0 = case x of MkT y -> y
 1142 
 1143       Here, `y` is used strictly, but the field of `x` certainly is not and
 1144       consequently will not be available unboxed.
 1145       Why not look at the demand of `x` instead to determine whether `y` is
 1146       unboxed? Because the 'idDemandInfo' on `x` will not have been propagated
 1147       to its occurrence in the scrutinee when CprAnal runs directly after
 1148       DmdAnal.
 1149 
 1150 We used to give the case binder the CPR property unconditionally instead of
 1151 deriving it from the case scrutinee.
 1152 See Historic Note [Optimistic case binder CPR].
 1153 
 1154 Historic Note [Optimistic case binder CPR]
 1155 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1156 We used to give the case binder the CPR property unconditionally, which is too
 1157 optimistic (#19232). Here are the details:
 1158 
 1159 Inside the alternative, the case binder always has the CPR property, meaning
 1160 that a case on it will successfully cancel.
 1161 Example:
 1162   f True  x = case x of y { I# x' -> if x' ==# 3
 1163                                      then y
 1164                                      else I# 8 }
 1165   f False x = I# 3
 1166 By giving 'y' the CPR property, we ensure that 'f' does too, so we get
 1167   f b x = case fw b x of { r -> I# r }
 1168   fw True  x = case x of y { I# x' -> if x' ==# 3 then x' else 8 }
 1169   fw False x = 3
 1170 Of course there is the usual risk of re-boxing: we have 'x' available boxed
 1171 and unboxed, but we return the unboxed version for the wrapper to box. If the
 1172 wrapper doesn't cancel with its caller, we'll end up re-boxing something that
 1173 we did have available in boxed form.
 1174 
 1175 -}