* Tie embeddings in the MLP speculator.
* Fix scale_weight when users speculate less than the config defines.
* Add scaling support and optimize some ops.
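
For readers unfamiliar with weight tying, here is a minimal, dependency-free sketch of the general idea: every speculator head reuses one embedding table instead of owning its own copy. The names `Embedding`, `build_head_embeddings`, and `count_parameters` are hypothetical and used only for illustration; this is not the code in this change, and the real speculator uses tensor modules rather than Python lists.

```python
class Embedding:
    """Toy embedding table: one row of `dim` floats per token id."""

    def __init__(self, vocab_size, dim):
        self.weight = [[0.0] * dim for _ in range(vocab_size)]

    def __call__(self, token_id):
        return self.weight[token_id]


def build_head_embeddings(n_heads, vocab_size, dim, tie_weights):
    """Return one embedding per head (hypothetical helper).

    With tie_weights=True, all heads reference the same Embedding object,
    so an update to the table is seen by every head.
    """
    if tie_weights:
        shared = Embedding(vocab_size, dim)
        return [shared] * n_heads
    return [Embedding(vocab_size, dim) for _ in range(n_heads)]


def count_parameters(embeddings):
    """Count parameters, counting each distinct table only once."""
    unique = {id(e): e for e in embeddings}
    return sum(len(e.weight) * len(e.weight[0]) for e in unique.values())


tied = build_head_embeddings(n_heads=3, vocab_size=10, dim=4, tie_weights=True)
untied = build_head_embeddings(n_heads=3, vocab_size=10, dim=4, tie_weights=False)
```

In this sketch, tying cuts the embedding parameter count by a factor equal to the number of heads, at the cost of the heads no longer learning separate token representations.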