understanding I() in lmer formula

9 messages · Emmanuel Curis, Ben Bolker, Don Cohen

#
          (    V(A)     cov(A, B) )   ( f   g )
  Sigma = (                       ) = (       )
          ( cov(A, B)     V(B)    )   ( g   h )


  so the aim of lme4 (or any other software) is to find the << best >>
  values for all of these five parameters.

So when I run lmer I should be able to recover these 5 values, right?
How do I do that?

When you say it finds the best values, that means to me that the
objective function (whatever we're trying to minimize) includes 
something that depends on those parameters.
What is that function? 
I guess you have to pick some particular model in order to show its
objective function, and I suggest the simplest possible model, in
this case something like
 reaction ~ 1+days + (1+days|subject)
and similarly the || version, which I gather differs only in that
some g terms are left out.

  This is done by trying to
  find the values that allow one to say << I obtained my data because
  they were the most likely to occur >>. This leads to a complex function
  that often reduces to least squares, but not always; Gaussian
  mixed-effects models are not linear least-squares problems because of
  the f, g and h parameters.

  Traditionally, you fix uA = uB = 0 because, should they have other
  values, they could not be distinguished from the fixed part of the
  model (but you could also say that there is no distinct fixed part and
  that you try to find their values; it comes to the same thing).

  When you use (1 + days | subject), you allow lmer to fit a model in
  which f, g and h all vary.

  When you use (1 + days || subject), you constrain lmer to fit a model
  with g = 0, so only f and h are fitted.

All of this makes sense to me, but in order to really understand what's
going on I want to know the objective function.

  Note that with lmer, f, h (and g if allowed) are not obtained by first
  computing slope and intercept for all subjects, then doing usual
  descriptive statistics ...

I understand, but however the software actually works, I should be able
to see that the objective function is minimized by simply computing it
on the output and then also on various perturbations of the output.
That would tell me WHAT the software is doing.  I could worry later about
HOW it does that.

  Last note: do not confuse the correlation between A and B, the random
  effects in the population, given by g, and the correlation between the
  estimators of the (mean) slope [uA] and the estimator of the (mean)
  intercept [uB], M_A and M_B, which may be what you had in mind when
  saying that A and B values were correlated << before >> (it exists in
  usual linear regression). They correspond to different ideas:

  g means that there is in the population a kind of constraint
   between A and B ;

  the correlation between the _estimators_ means that any error on
   the estimation of the (mean) slope will lead to an error in the
   estimation of the (mean) intercept and it is a property of the
   method used to find them, not of the data/underlying world.

I had not even thought about the second of those.  
But I think that is similar to one of the outputs of summary(lmer(...))
where it says correlation of fixed effects.

  Hope this clarifies,

So far I don't think I've learned anything new, but I may be getting
close to describing to you what I'm missing.
#
Replies in the text. Please read about maximum likelihood methods: the
objective function is the likelihood of the data, and writing it down
will give you what you want. In practice it might be slightly different
if you used restricted maximum likelihood (REML) instead of maximum
likelihood (ML), but that does not change the interpretation of the
different models and outputs, just the objective function.
On Thu, Jun 15, 2017 at 02:49:50PM +0000, Don Cohen wrote:
> 
>           (    V(A)     cov(A, B) )   ( f   g )
>   Sigma = (                       ) = (       )
>           ( cov(A, B)     V(B)    )   ( g   h )
> 
> 
>   so the aim of lme4 (or any other software) is to find the << best >>
>   values for all of these five parameters.
> 
> So when I run lmer I should be able to recover these 5 values, right?
> How do I do that?

All of them are in summary( lmer( formula, data = ... ) )

For the uA and uB estimates, you can get them with fixef()
For f, g and h, you can get them with VarCorr()

> When you say it finds the best values, that means to me that the
> objective function (whatever we're trying to minimize) includes
> something that depends on those parameters.
> What is that function?

The opposite of the log-likelihood, -log L, of the data. More
precisely, assuming independent data points,

L = \prod_{i=1}^n \prod_{j=1}^{n_i} F'(Yij = y_ij | Ai = ai, Bi = bi) G'(Ai = ai, Bi = bi)

where F' is the density of a Gaussian of mean ai + bi * days_{i,j} and
variance sigma^2 (the variance of epsilon, the residual), G' is the
joint density of a bivariate Gaussian vector of expectation (uA, uB)
and covariance matrix Sigma = ( (f, g), (g, h) ), and you minimize
-log L. Here i is the subject index and j the measure index within the
subject.
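A sketch of this computation in R (assumptions: lme4 and mvtnorm are
installed, and the sleepstudy data shipped with lme4 stand in for the
model discussed here). One subtlety worth making explicit: the
likelihood lmer actually maximizes is the *marginal* one, with the
random effects integrated out; for a Gaussian model that integral has a
closed form, Y_i ~ N(X_i beta, Z_i Sigma Z_i' + sigma^2 I):

```r
library(lme4)
library(mvtnorm)

# Fit the correlated intercept/slope model by ML (REML = FALSE)
m <- lmer(Reaction ~ 1 + Days + (1 + Days | Subject), sleepstudy, REML = FALSE)

beta  <- fixef(m)            # (uA, uB)
Sigma <- VarCorr(m)$Subject  # 2x2 matrix ( f g ; g h )
s2    <- sigma(m)^2          # residual variance

# Marginal log-likelihood: one multivariate-normal term per subject
ll <- sum(sapply(split(sleepstudy, sleepstudy$Subject), function(d) {
  X <- cbind(1, d$Days)      # here Z = X (random intercept and slope)
  V <- X %*% Sigma %*% t(X) + s2 * diag(nrow(d))
  dmvnorm(d$Reaction, mean = as.vector(X %*% beta), sigma = V, log = TRUE)
}))

ll          # should agree with the next line
logLik(m)
```

Perturbing beta, Sigma or s2 away from the fitted values should only
ever decrease ll, which is the check proposed later in this thread.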

> I guess you have to pick some particular model in order to show its
> objective function, and I suggest the simplest possible model, in
> this case something like
>  reaction ~ 1+days + (1+days|subject)
> and similarly the || version, which I gather differs only in that
> some g terms are left out.
> 
>   This is done by trying to
>   find the values that allow one to say << I obtained my data because
>   they were the most likely to occur >>. This leads to a complex function
>   that often reduces to least squares, but not always; Gaussian
>   mixed-effects models are not linear least-squares problems because of
>   the f, g and h parameters.
> 
>   Traditionally, you fix uA = uB = 0 because, should they have other
>   values, they could not be distinguished from the fixed part of the
>   model (but you could also say that there is no distinct fixed part and
>   that you try to find their values; it comes to the same thing).
> 
>   When you use (1 + days | subject), you allow lmer to fit a model in
>   which f, g and h all vary.
> 
>   When you use (1 + days || subject), you constrain lmer to fit a model
>   with g = 0, so only f and h are fitted.
> 
> All of this makes sense to me, but in order to really understand what's
> going on I want to know the objective function.

Just write it down using the hints above... Or use the formulas in
Douglas Bates' book (or other books). They _are_ the objective
function. By e-mail, it would be almost impossible to write them out in
a readable way.

>   Note that with lmer, f, h (and g if allowed) are not obtained by first
>   computing slope and intercept for all subjects, then doing usual
>   descriptive statistics ...
> 
> I understand, but however the software actually works, I should be able
> to see that the objective function is minimized by simply computing it
> on the output and then also on various perturbations of the output.

Once you have written it down, you can use dnorm to compute it, and use
the logLik function to check the results.

> That would tell me WHAT the software is doing.  I could worry later about
> HOW it does that.

It maximises the likelihood L (or, equivalently, minimises -log L)
as a function of (uA, uB, f, g, h).

>   Last note: do not confuse the correlation between A and B, the random
>   effects in the population, given by g, and the correlation between the
>   estimators of the (mean) slope [uA] and the estimator of the (mean)
>   intercept [uB], M_A and M_B, which may be what you had in mind when
>   saying that A and B values were correlated << before >> (it exists in
>   usual linear regression). They correspond to different ideas:
> 
>   g means that there is in the population a kind of constraint
>    between A and B ;
> 
>   the correlation between the _estimators_ means that any error on
>    the estimation of the (mean) slope will lead to an error in the
>    estimation of the (mean) intercept and it is a property of the
>    method used to find them, not of the data/underlying world.
> 
> I had not even thought about the second of those.  
> But I think that is similar to one of the outputs of summary(lmer(...))
> where it says correlation of fixed effects.
> 
>   Hope this clarifies,
> 
> So far I don't think I've learned anything new, but I may be getting
> close to describing to you what I'm missing.
#
This is all good.  vignette("lmer",package="lme4") gives all of the
technical details for lmer (glmer is a bit more complicated, and the
paper describing it is still in development ...)
On 17-06-15 12:02 PM, Emmanuel Curis wrote:
#
Emmanuel Curis writes:
 >  So when I run lmer I should be able to recover these 5 values, right?
 >  How do I do that?
 > 
 > All of them are in summary( lmer( formula, data = ... ) )
 > For the uA and uB estimates, you can get them with fixef()
 > For f, g and h, you can get them with VarCorr()

Ah, now that I compare fixef and VarCorr with summary
I see which ones you mean.

 > The opposite of the log-likelihood, -log L, of the data. More
 > precisely, assuming independent data points,
 > 
 > L = \prod_{i=1}^n \prod_{j=1}^{n_i} F'(Yij = y_ij | Ai = ai, Bi = bi) G'(Ai = ai, Bi = bi)
 > 
 > where F' is the density of a Gaussian of mean ai + bi * days_{i,j} and
 > variance sigma^2 (the variance of epsilon, the residual), G' is the
 > joint density of a bivariate Gaussian vector of expectation (uA, uB)
 > and covariance matrix Sigma = ( (f, g), (g, h) ), and you minimize
 > -log L. Here i is the subject index and j the measure index within the
 > subject.

I'm having a little trouble with your character set (notice I've been
replacing your mu's with "u") and a little more with the \math
notation, but I think what I'm seeing above is that the likelihood
measure is computed purely from measures for the data points and the
measure for a single data point is the product of a likelihood of the
a,b pair for the group, which is G', and the likelihood of the point
given the group, which is F'.

Now this already surprises me in the sense that the cost of the
group seems to be paid once for every data point in the group
instead of only once per group.
It seems to me that if I were trying to specify the groups and
data points I'd only have to specify each group once and, within
that, each data point for the group once.
But maybe this is just a matter of missing parens?
That would be 
product over all groups of
   prob for (a,b) for this group
   x 
   product over all points in the group of
      prob of point given group (related to square of residual)
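The parenthesised version of the formula, with each group's G' factor
paid once per group rather than once per observation, would be (same
symbols as above):

```latex
L = \prod_{i=1}^{n} \left[ G'(A_i = a_i, B_i = b_i)
      \prod_{j=1}^{n_i} F'(Y_{ij} = y_{ij} \mid A_i = a_i, B_i = b_i) \right]
```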

A few other details:
The residual is still only one number for each point, right, so sigma,
its variance, is only one number, right?
And is this estimated by just computing the variance of the residuals?
I'm guessing that's the same thing as finding the value that maximizes
the product of the F' values.
Is there a function to compute G' ?
(Is this it?
 library(mvtnorm)
 dmvnorm(c(5.280231, 9.719769), c(5,10),matrix(c(1,-.9,-.9, 1),nrow=2,ncol=2))
)

 > Just write it down using the hints above... Or use the formulas in
 > Douglas Bates' book (or other books). They _are_ the objective
 > function. By e-mail, that would be almost impossible to write them in
 > a readable way.

Which book is this?  URL ?
lme4: Mixed-effects modeling with R, June 25, 2010 ?
http://lme4.r-forge.r-project.org/lMMwR/lrgprt.pdf ?
I'll look further at that.

Also, bbolker writes:
 vignette("lmer",package="lme4") gives all of the technical details
It gives me:
 Warning message:
 vignette 'lmer' not found 

Is this the same as (or similar to)
https://cran.r-project.org/web/packages/lme4/vignettes/Theory.pdf
Or
https://cran.r-project.org/web/packages/lme4/vignettes/lmer.pdf
which seems to be the same as the one I had trouble with before.
But maybe I can make more progress in these now that I have some
ability to ask questions.

 >  So far I don't think I've learned anything new, but I may be getting
 >  close to describing to you what I'm missing.

This is progress.  It's similar to what I was expecting, but not
what I was able to explain from my simple examples.  It's possible
that the problem was in my examples, but that remains to be seen.
1 day later
#
(Probably I should change the subject line and admit that this is no
longer about I().)

Emmanuel Curis writes:
 > L = \prod_{i=1}^n \prod_{j=1}^{n_i} F'(Yij = y_ij | Ai = ai, Bi = bi) G'(Ai = ai, Bi = bi)
 > 
 > where F' is the density of a Gaussian of mean ai + bi * days_{i,j} and
 > variance sigma^2 (the variance of epsilon, the residual), G' is the
 > joint density of a bivariate Gaussian vector of expectation (uA, uB)
 > and covariance matrix Sigma = ( (f, g), (g, h) ), and you minimize
 > -log L. Here i is the subject index and j the measure index within the
 > subject.
 ...
 > Once you have written it down, you can use dnorm to compute it, and use
 > the logLik function to check the results.

I'm trying to do this and things are not working out quite as I hoped.
Starting with the result of lmer on my sample data:
  > summary(xm) 
  Linear mixed model fit by maximum likelihood  ['lmerMod'] 
  Formula: xout ~ xin + (xin | xid) 
     Data: xd 

       AIC      BIC   logLik deviance df.resid  
     -16.2    -15.7     14.1    -28.2        2  

  Scaled residuals:  
       Min       1Q   Median       3Q      Max  
  -0.89383 -0.29515  0.05005  0.33286  0.80360  

  Random effects: 
   Groups   Name        Variance  Std.Dev. Corr  
   xid      (Intercept) 2.814e-03 0.053051       
            xin         6.583e-03 0.081134 -0.03 
   Residual             4.917e-05 0.007012       
  Number of obs: 8, groups:  xid, 3 

  Fixed effects: 
              Estimate Std. Error t value 
  (Intercept)  0.06061    0.03172   1.911 
  xin          1.00397    0.04713  21.301 

  Correlation of Fixed Effects: 
      (Intr) 
  xin -0.058 

  > coef(xm) 
  $xid 
    (Intercept)      xin 
  a  0.09870105 1.100632 
  b -0.01183068 1.008098 
  c  0.09496031 0.903172 

The good news is that I seem to be correctly computing the predicted
values, e.g., 0.09870105 + 1.100632 * xin for a data point in
group "a" gives a value which differs from its xout value by the same
amount as the corresponding value of xm@resp$wtres (which I surmise
is a vector of the residuals).
So I expect that F' will be
 dnorm(xm@resp$wtres ...)
The mean should be zero, by design ...
  > mean(xm@resp$wtres)
  [1] -2.775558e-17
as expected, essentially zero

The sd to pass to dnorm should be what?  Is that the 0.007012 from
this line above?
   Residual             4.917e-05 0.007012
That's not very close to what I see in the actual residuals:
  > var(xm@resp$wtres)
  [1] 1.566477e-05
  > sd(xm@resp$wtres)
  [1] 0.003957874
And I'd expect the value to be at least close to the actual SD of
the residuals.  If the correct value is neither of the two above,
please tell me what it is (or how to find it).
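One possible check here (a sketch; sleepstudy stands in for xm, whose
data are not in the thread): the Std.Dev. printed under Residual is the
model's estimate sigma(m), not the sample SD of the fitted residuals.
The conditional residuals are shrunk toward zero by the random effects,
so their SD is typically smaller than sigma(m), which would explain the
gap between 0.007012 and 0.003957874:

```r
library(lme4)

# Hypothetical stand-in for xm: same model structure on the sleepstudy data
m <- lmer(Reaction ~ 1 + Days + (1 + Days | Subject), sleepstudy, REML = FALSE)

sigma(m)          # the unrounded value printed under Residual Std.Dev.
sd(residuals(m))  # SD of the conditional residuals; typically smaller
```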

Then for G, I'm even less certain.  I suspect I'm supposed to use
dmvnorm(...)
The first argument would be one of the lines from the coef output above,
such as c(0.09870105, 1.100632), right?
The second (mean) would be the estimates of the fixed effects,
 c(0.06061,1.00397), right?
And the third, is supposed to be the variance-covariance matrix.
That would be your 
 f g
 g h
where f and h are the two variances shown under:
  Random effects: 
   Groups   Name        Variance  Std.Dev. Corr  
   xid      (Intercept) 2.814e-03 0.053051       
            xin         6.583e-03 0.081134 -0.03 
   Residual             4.917e-05 0.007012       
so f = 2.814e-03, h = 6.583e-03
and g, I *think* would be the product of the two SD's and the Corr,
or 0.053051 * 0.081134 * -0.03
Is that all correct?

I hope you can see some errors above, since this doesn't look like
it's going to end up computing anything like the logLik of 14.1 !

Also, I have the feeling that all of this output is rounded, and that
I'd get better results (or at least results that agree with the other
results computed in the model) if I could find these data in the model
rather than copying them from the output.  I've found some of it, like
the residuals (at least that's what I think wtres is!), but other data
above, such as f, g and h, I have not yet found.  If you can tell me how
to find (or compute) them, that would be helpful.
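A sketch of how to pull the unrounded values straight from the fitted
object (assuming the fitted model xm from above; the accessors are
standard lme4):

```r
# Unrounded variance parameters, no copying from the printed summary
vc    <- VarCorr(xm)
Sigma <- vc$xid             # 2x2 matrix ( f g ; g h ) for the xid grouping
f <- Sigma[1, 1]            # Var(intercept)
g <- Sigma[1, 2]            # Cov(intercept, slope)
h <- Sigma[2, 2]            # Var(slope)
sigma(xm)                   # unrounded residual SD
as.data.frame(vc)           # every variance/covariance/correlation as a table
```

This also confirms the guess above: the printed Corr is
g / (sqrt(f) * sqrt(h)), so g is indeed the product of the two SDs and
the correlation.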
#
For the level of detail you're getting into, it would be a really good
idea to read the paper that accompanies the lme4 package:
vignette("lmer",package="lme4") .  This goes into a lot of detail
about the theory and data structures ...
On Fri, Jun 16, 2017 at 11:48 PM, Don Cohen <don-r-help at isis.cs3-inc.com> wrote:
#
Ben Bolker writes:
 > For the level of detail you're getting into, it would be a really good
 > idea to read the paper that accompanies the lme4 package:
 > vignette("lmer",package="lme4") .  This goes into a lot of detail
 > about the theory and data structures ...

 vignette("lmer",package="lme4") 
gives me
 Warning message: 
 vignette 'lmer' not found  

Is this the same as 
https://cran.r-project.org/web/packages/lme4/vignettes/lmer.pdf ?

I think that's the same one that I was having trouble with before
and gave up around eqn 15.
Although I had the impression that it (eqn 14, it looks like) was
describing what I expected and asked about in a previous message,
namely paying only once for each group and then once for each data
point within the group.

Are you saying that the vignette link above actually answers the
questions in this last message about how to compute the loglik of
the model?  It doesn't look to me like it will.  
I view my current line of questions (and I have many more, but am
trying to resist bombarding you with all at once) as a way to get
the background I'll need to get through that paper.

In fact, one of my questions when I read that paper was what 
correlated vs uncorrelated intercept and slope meant - I didn't 
see any explanation.  I think that Emmanuel Curis has now explained
that, but I'm still trying to check my understanding.

Since I'm writing anyway, let me indulge in one more question about
the formulas.  Since (x|g) means correlated intercept and slope for
x, does (x+y|g) include a separate correlation between x slope and
y slope?  That is, the cost of specifying a group would involve a
3 dimensional normal distribution over intercept,x,y ?
#
On Sat, Jun 17, 2017 at 12:29 PM, Don Cohen <don-r-help at isis.cs3-inc.com> wrote:
That's surprising ... what's packageVersion("lme4") ?
Yes.
I would also recommend checking out Pinheiro and Bates (2000),
which is a full (book-length) treatment of the same topic, so is a
tiny bit more discursive/friendlier ...
Yes.    (So in particular this model has 3*(3+1)/2 = 6 covariance
parameters, equivalent to s^2(1), s^2(x), s^2(y), cov(1,x), cov(1,y),
cov(x,y).)
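A sketch of that structure (illustration only: sleepstudy has a single
covariate, so Days2 below is just Days squared, added purely to get a
trivariate random effect; the fit may emit scaling or convergence
warnings):

```r
library(lme4)

# Add a second within-subject covariate so (1 + x + y | g) has 3 terms
d  <- transform(sleepstudy, Days2 = Days^2)
m3 <- lmer(Reaction ~ Days + Days2 + (1 + Days + Days2 | Subject), data = d)

VarCorr(m3)$Subject   # 3x3 covariance matrix: 3 variances on the diagonal,
                      # 3 covariances off it = 3*(3+1)/2 = 6 free parameters;
                      # the || form would zero the off-diagonal covariances
```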
#
 > > gives me
 > >  Warning message:
 > >  vignette 'lmer' not found
 > 
 > That's surprising ... what's packageVersion("lme4") ?
[1] '1.1.14'
[actually I'm replacing left and right non-ascii quotes above]

 > Pinheiro and Bates (2000),
I'll look, but it seems that for something that old, even if it does
contain these details, they are likely to be out of date.
I noticed http://lme4.r-forge.r-project.org/lMMwR/lrgprt.pdf
which is from 2010 says on p.15
 The vector u is available in fm01ML@re.  The vector Beta and the model matrix
 X are in fm01ML@fe.
and this seems no longer true - I get
  Error: no slot of name "fe" for this object of class "lmerMod"

 >   Yes.    (So in particular this model has 3*(3+1)/2 = 6 covariance
 > parameters, equivalent to s^2(1), s^2(x), s^2(y), cov(1,x), cov(1,y),
 > cov(x,y).)

That might explain why changing (x+y+z+u+v+w||g) to a single | seems to
run a lot longer (I interrupted it, so I don't know how much longer).
Do you have an estimate of how the number of terms affects run time?

If a paper describes various things being used as random effects
would you assume | or || ?
Perhaps it depends on the number of effects?  The example I sent
contained quite a few, and the original from which that was extracted
contained over a dozen.