Epigenetic regulation plays a central role in governing embryo development and somatic cell reprogramming. Taking advantage of recent advances in low-input sequencing techniques, researchers have obtained a comprehensive view of the epigenetic landscape during the rapid transcriptome transitions involved in cell fate commitment. The well-organized epigenetic reprogramming also highlights the essential roles of specific epigenetic regulators in supporting efficient regulation of transcription activity and chromatin remodeling. This review briefly introduces recent progress in the molecular dynamics and regulatory mechanisms implicated in mouse early embryo development and somatic cell reprogramming, as well as the multi-omics regulatory mechanisms of totipotency mediated by several key factors, which provide valuable resources for further investigations of the complicated regulatory network in essential biological events.
Introduction
Cell fate transition is accomplished by global changes in the transcriptome and thus the proteome, which is mainly achieved through epigenetic regulations. Epigenetic marks including DNA methylation, histone modifications, higher order chromatin structures, and noncoding RNAs regulate the gene expression without altering the DNA sequence, and these marks may exhibit great dynamics to support the transcriptome transition. The preimplantation development provides an ideal model for investigating the roles of epigenetic regulation in the cell fate transition. In mice, the genome-wide epigenetic status of embryos undergoes profound reprogramming upon fertilization to establish the totipotency of early embryos, and the remodeling of epigenetic marks is also extensively involved in the subsequent cell fate commitment.
Besides the in vivo development, cell fate could also be reprogrammed to the totipotent or pluripotent state through somatic cell nuclear transfer (SCNT) or generation of induced pluripotent stem cells (iPSCs), in which epigenetic regulations also play a major role. In recent years, with the advances in low-input high-throughput sequencing, a series of studies were able to uncover the epigenetic dynamics during mouse early embryo development initiated from fertilization or SCNT, and these discoveries in the detailed chromatin state not only contribute to an unprecedented view of the developmental process, but also provide valuable clues for implicated regulatory mechanisms which require further investigations.
In this review, we summarize some of the recent progress in studying the epigenetic regulations of embryos after fertilization, SCNT embryos, and iPSCs, and also illustrate several studies uncovering the regulatory roles of essential factors in developmental control and pluripotency maintenance.
Multi-omics remodeling of embryonic development in mice
The transition from the maternally inherited environment to the zygotically activated program is a crucial step shortly after fertilization; it is termed the maternal-to-zygotic transition (MZT) and is mostly completed by the two-cell (2C) stage in mice [1]. MZT is achieved through the clearance of maternally stored RNAs and proteins, along with the transcriptional activation of the zygotic genome [1–3]. The establishment of cell polarity at the morula stage and the lineage differentiation at the blastocyst stage are also accompanied by dramatic changes in the transcriptome and epigenome [4]. Therefore, investigations of the genetic and epigenetic dynamics during mammalian embryo development may provide clues to the underlying molecular mechanisms (Figure 1).
Proteome transitions during development of oocytes and preimplantation embryos
As mentioned above, a variety of RNAs and proteins are produced and stored during the growth of oocytes, and the composition of these maternal factors has attracted a lot of interest. A previous study reported that transcripts of genes associated with ribosome assembly and translation are particularly abundant throughout the growth phases of oocytes, and the overrepresentation of these genes reflects the high demand for protein synthesis, which is in accordance with the ∼150-fold increase in oocyte volume during the follicle development [5]. To identify maternal proteins which might contribute to the reprogramming of the terminally differentiated germ cells after fertilization in mice, one study profiled the proteome of germinal vesicle (GV) oocytes, metaphase II (MII) oocytes, and zygotes using liquid chromatography with tandem mass spectrometry (LC–MS/MS) [6]. It was found that oocytes and zygotes possess more unique protein families compared with embryonic stem cells (ESCs), and the majority of these proteins are involved in the regulation of self-renewal and cell cycle. Moreover, the overrepresented proteins are enriched in differential pathways at each stage. GV oocytes contain more metabolism-related proteins which may support the oocyte maturation; MII oocytes contain many cell-cycle regulators, epigenetic modifiers, and pluripotency regulators; the total amount of proteins in zygotes significantly decreases as maternal proteins are quickly degraded upon fertilization, and accordingly ubiquitination-related proteins are highly enriched [6]. This work provides valuable resources for further functional analyses of maternal proteins.
Although the transcriptome dynamics during mouse preimplantation development have been revealed through the single-cell RNA-sequencing and key genes driving the developmental progress have been highlighted [7], the transcriptional architecture may not exactly reflect the proteome landscape in early embryos. Therefore, a quantitative MS strategy was applied to uncover the protein expression dynamics of mouse embryos from the zygote to the blastocyst stage [8]. Proteins could be clustered according to their expression profiles or the temporal profiles of phosphorylation sites in all analyzed stages, and key kinases as well as essential signal pathways were predicted according to the phosphorylation changes [8]. For example, JNK2, CDK2, BARK1, and ERK1 were predicted to be responsible for the active phosphorylation through analyzing the amino acid sequence motifs surrounding the changed phosphorylation sites, and novel phosphorylation events in key enzymes mediating glycogenesis downstream of insulin signaling were identified. Importantly, the identification of protein co-expression modules suggested highly correlated core factors including Rrp9, Cript, and Zcchc8, which were supposed to be indispensable during early development. Rrp9 was proved to be required for the blastocyst formation in this study [8], and the maternal factor Zcchc8 was also shown to play an important role in embryo development [9] (discussed in detail in “Epigenetic cross-talk implicated in the exit of 2C state”). Furthermore, a regulation gap between the expression level of transcripts and proteins is present during the MZT [8], highlighting the necessity for the proteomic profiling.
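The idea of grouping proteins into co-expression modules by their temporal profiles can be illustrated with a minimal sketch. The stage labels, protein names, and abundance values below are invented for illustration and are not data from [8]. Pearson correlation with a greedy grouping rule is an assumed, simplified stand-in for whatever clustering method the original study used.

```python
from math import sqrt

# Hypothetical stage order and made-up relative abundances (not data from [8]).
STAGES = ["zygote", "2C", "4C", "8C", "morula", "blastocyst"]
PROFILES = {
    "protA": [1.0, 1.2, 2.0, 3.1, 4.0, 5.2],  # rises over development
    "protB": [0.9, 1.1, 2.2, 2.9, 4.1, 4.8],  # rises over development
    "protC": [5.0, 4.1, 3.0, 2.2, 1.1, 0.5],  # declines over development
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_modules(profiles, threshold=0.9):
    """Greedy grouping: a protein joins the first module whose seed
    (first member) it correlates with at or above the threshold;
    otherwise it starts a new module. The threshold is an assumption."""
    modules = []
    for name, values in profiles.items():
        for module in modules:
            if pearson(values, profiles[module[0]]) >= threshold:
                module.append(name)
                break
        else:
            modules.append([name])
    return modules


modules = coexpression_modules(PROFILES)
print(modules)
```

With these toy values, the two rising profiles fall into one module and the declining profile forms its own. A real analysis would work on normalized quantitative MS intensities and a dedicated clustering or network method.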
Figure 1.
Dynamics of transcriptome and core histone modifications during mouse preimplantation development. Upon fertilization, maternal mRNAs start to degrade, and the zygotic genome activation (ZGA) occurs at the two-cell stage to achieve the maternal-to-zygotic transition (MZT). The conversion from the terminally differentiated gametes to the totipotent zygotes also triggers the reprogramming of epigenetic marks. The global occupancy of different histone H3 methylations shows diverse dynamics during the reprogramming, and the unique feature of each modification is illustrated on the right. The active mark H3K4me3 forms noncanonical broad domains in MII oocytes, and these domains are inherited in zygotes but substantially reduced at the two-cell stage, when the canonical H3K4me3 domains start to be established. The H3K4me3 demethylase KDM5B might have roles in controlling the width of H3K4me3 domains. The repressive mark H3K27me3 exhibits a massive loss at the zygote and two-cell stages in promoter regions of parental alleles. While a global erasure of H3K27me3 occurs on the paternal alleles upon fertilization, the distal H3K27me3 domains in non-promoter regions persist on the maternal alleles. The constitutive heterochromatin mark H3K9me3 is implicated in the silencing of long terminal repeats (LTRs). H3K9me3 marks on LTRs are progressively established after fertilization and remain at a high level during the blastocyst formation. The histone chaperone CHAF1A facilitates H3K9me3 deposition on LTRs and is thus required for LTR silencing in early embryos. Most promoters of coding genes lose H3K9me3 after fertilization, and H3K9me3 domains on these regions are rebuilt at postimplantation stages.

Further quantitative proteomic analyses were also conducted to reveal the metabolic features during meiotic maturation of mouse oocytes [10], and several studies contributed to the understanding of the proteome composition in human oocytes and blastocysts [11–13]. These studies suggest that oocytes are largely resting cells with a proteome that is tailored for homeostasis, cellular attachment, and interaction with the environment, whereas proteins associated with metabolism, assembly, maintenance, and cilium function are primarily found in human blastocyst cells [11, 12]. Taken together, these comprehensive analyses of the proteome of early embryos provide new insights into the molecular mechanisms implicated in the developmental events, and the unique pattern of the proteome suggests that abundant posttranscriptional regulation might play an essential role in these processes.
Unique features and functions of histone methylation during development
Posttranslational modifications (PTMs) on histone N-terminal ends are crucial epigenetic marks regulating chromatin structures and gene expression [14, 15]. It is widely accepted that various kinds of these PTMs including acetylation, methylation, phosphorylation, and ubiquitination are closely correlated with the transcription activity and chromatin openness in early embryos [16–19]. Histone H3 lysine methylation comprises a considerable component of such epigenetic control and has been well characterized in varied biological events [20, 21]. With the development of low-input chromatin immunoprecipitation followed by sequencing (ChIP-seq) methods, the global dynamics of the active mark, histone H3 Lys4 trimethylation (H3K4me3), the repressive mark, histone H3 Lys27 trimethylation (H3K27me3), and the constitutive heterochromatin mark, histone H3 Lys9 trimethylation (H3K9me3), in mouse preimplantation embryos have been systematically profiled [22–26]. Overall, the chromatin state reprogrammed after the zygotic genome activation (ZGA) is significantly different from the maternally established status.
A noncanonical form of H3K4me3 existing as broad domains was observed on promoters and a large number of distal loci in full-grown and mature oocytes, and these broad H3K4me3 domains are negatively correlated with DNA methylation [22, 23]. The noncanonical H3K4me3 is inherited in zygotes and erased at the late 2C stage, and this active removal is required for the ZGA process as well as the full developmental potential [22]. The canonical sharp H3K4me3 domains on promoters are rapidly established after fertilization [23, 24]. It was also found that the H3K4me3 breadth should be precisely regulated since widening H3K4me3 domains through knockdown (KD) of genes encoding H3K4me3 demethylases, especially Kdm5b, blocked the lineage differentiation at the blastocyst stage [24]. Broad H3K4me3 domains in promoter regions may indicate a high transcription level after fertilization [24], and the transition from noncanonical to canonical H3K4me3 at the 2C stage requires the activation of zygotic transcription [23].
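The distinction between broad (noncanonical) and sharp (canonical) H3K4me3 domains is at heart a statement about domain width. The sketch below merges overlapping peak intervals and labels each merged domain by its width. The peak coordinates are invented, and the 5 kb cutoff is an arbitrary assumption chosen for illustration, not a threshold taken from [22–24].

```python
# Assumed width cutoff (bp) separating "broad" from "sharp" domains.
# This value is illustrative only and is not taken from the cited studies.
BROAD_CUTOFF_BP = 5000


def merge_peaks(peaks, max_gap=0):
    """Merge (start, end) intervals on one chromosome that overlap or lie
    within max_gap bp of each other; returns sorted merged intervals."""
    merged = []
    for start, end in sorted(peaks):
        if merged and start <= merged[-1][1] + max_gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def classify_domains(peaks, cutoff=BROAD_CUTOFF_BP):
    """Label each merged domain as 'broad' or 'sharp' by its width."""
    return [
        (start, end, "broad" if end - start >= cutoff else "sharp")
        for start, end in merge_peaks(peaks)
    ]


# Hypothetical peak calls on a single chromosome.
peaks = [(1000, 2500), (2400, 9000), (20000, 21200)]
domains = classify_domains(peaks)
print(domains)
```

Here the first two overlapping peaks merge into one 8 kb domain labeled broad, while the isolated 1.2 kb peak is labeled sharp. In practice the cutoff and merging distance would need to be chosen from the width distribution of the actual ChIP-seq data.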
In contrast to the quick establishment of H3K4me3, the global gain of H3K27me3 is relatively mild, and the number of H3K27me3-marked promoters markedly decreases at the 2C stage compared with MII oocytes. In particular, an extensive loss of promoter H3K27me3 at Hox and other developmental genes was observed upon fertilization [24, 25]. While a global erasure of H3K27me3 occurs on paternal alleles, non-promoter H3K27me3 domains persist on maternal alleles, and these distal H3K27me3 domains tend to be located in regions that lack both transcription and DNA methylation [25]. Moreover, H3K4me3 and H3K27me3 tend to be mutually exclusive at early cleavage stages, but this nonoverlapping distribution pattern weakens significantly at the blastocyst stage, when the first lineage differentiation takes place [24].
Unlike the H3K4me3 and H3K27me3 domains, which are commonly enriched at gene promoters, H3K9me3 domains mainly reside in distal regions and are associated with heterochromatin formation. H3K9me3 has been shown to be an important player in silencing lineage-inappropriate genes [27, 28], and H3K9me3 establishment is believed to be responsible for the transcriptional silencing of long terminal repeats (LTRs) in mouse ESCs [29–31]. Systematic profiling reveals that H3K9me3-marked domains increase most dramatically at the 2C stage, and that H3K9me3 remains at a high level at LTRs throughout the preimplantation stages but shows low enrichment at promoters before implantation [26]. Interestingly, asymmetric H3K9me3 signals were observed in paternal and maternal genomes, and this imbalance was attributed not only to the different inheritance from gametes but also to the differential reprogramming in parental genomes [26]. It is widely accepted that histone modifications instead of DNA methylation are the major epigenetic marks responsible for LTR silencing in preimplantation embryos, as global DNA demethylation occurs upon fertilization [32]. Consistently, a gradual replacement of DNA methylation by H3K9me3 was observed on specific LTRs which are transiently expressed from the 2C to the eight-cell stage and subsequently silenced [26]. The histone chaperone CHAF1A was further shown to play an essential role in the silencing of these LTRs through mediating H3K9me3 establishment, and this regulation by CHAF1A is required for blastocyst formation. Moreover, this study provides evidence for H3K9me3-mediated inactivation of lineage-incompatible genes after implantation, suggesting an important role of H3K9me3 in the lineage specification [26]. Another study also reveals the significance of H3K9me3-mediated silencing in regulating lineage-specific genes during development [33].
Besides H3K27me3, histone H2A Lys119 ubiquitination (H2AK119ub1) is also deposited by the polycomb repressive complex and plays an important role in transcription repression [34]. Sperm-derived H2AK119ub1 is globally erased upon fertilization, and oocyte-inherited H2AK119ub1 is equalized on the paternal and maternal alleles at the two-cell stage [35]. H2AK119ub1 is required for H3K27me3 establishment on a subset of genes in oocytes and prevents premature activation of developmental genes during early development [35–37]. Another histone methylation, H3 Lys36 trimethylation (H3K36me3), is often enriched in actively expressed gene bodies and works synergistically with DNA methylation to protect transcribing regions from H3K4me3 and H3K27me3 [38, 39]. H3K36me3 is globally lost on parental alleles upon fertilization and then reestablished during ZGA [39]. Indeed, sufficient H3K36me3 is indispensable for establishing the correct DNA methylome and preventing the invasion of H3K4me3 and H3K27me3 on imprinting control regions [39].
In addition to the histone modifications, the dynamic changes of other epigenetic states such as DNA methylation, chromatin accessibility, and 3D chromatin architecture during development have also been uncovered [40–45]. These studies are continuously improving our knowledge of the overall landscape of epigenetic regulation involved in mammalian development, and provide valuable resources for investigating the regulatory mechanisms.
Understanding the epigenetic barrier of SCNT
The SCNT technique takes advantage of factors within oocytes to trigger the reprogramming of the somatic genome, resulting in a totipotent state which allows the generation of cloned animals [46, 47]. Factors that promote the reprogramming of somatic cells are believed to reside in the cytoplasm of oocytes, where they direct the remodeling of parental pronuclei following fertilization [48]. Compared with other reprogramming processes, SCNT mediates a faster establishment of totipotency and a developmental process more similar to that of embryos after fertilization. Human ESC lines can be derived from SCNT embryos and well maintained, raising the possibility of clinical applications of therapeutic cloning [49–51]. However, the developmental process of SCNT embryos is frequently disrupted before the blastocyst stage or after implantation, which greatly limits the therapeutic applications [52]. The abnormality of cloned embryos might result from unsuccessful reprogramming of the somatic genome, with aberrant expression patterns and/or improper epigenetic marks. Therefore, uncovering the epigenetic dynamics in the developmental stages after SCNT would significantly extend our understanding of the regulatory mechanisms which drive the reprogramming process. Considerable effort has been devoted to studying the transcription regulation and chromatin architecture of SCNT embryos (Figure 2). These studies also provide possible solutions for improving the developmental potential of cloned embryos.
Histone and DNA methylome defects inhibiting the faithful development of SCNT embryos
The developmental arrest of cloned embryos before implantation often occurs at the cleavage stages, and it is hard to directly predict the fates of embryos after SCNT or elucidate causes of the limited development potential. To address this problem, an embryo biopsy culture system followed by single-cell RNA-seq was applied, allowing the simultaneous transcriptome profiling and embryo fate tracing [53]. Among the genes downregulated in 2C-arrested SCNT embryos, this study particularly focused on transcription factors and epigenetic regulators, and noticed the presence of the H3K9me3 demethylase gene Kdm4b. Indeed, overexpression (OE) of Kdm4b mRNA in enucleated MII oocytes prior to SCNT would greatly improve the developmental efficiency [53]. Further investigations revealed that Kdm4b OE facilitated the removal of promoter H3K9me3 inherited from somatic cells in 2C SCNT embryos, and the expression level of corresponding genes was largely restored [53]. Using a similar strategy, this study identified Kdm5b as a crucial factor for SCNT embryos to develop beyond the four-cell stage. Strikingly, the combination of Kdm4b and Kdm5b OE even improved the blastocyst rate of SCNT embryos to over 95%, and the derivation rate of ESC lines from SCNT blastocysts and the full-term development rate of cloned embryos were increased accordingly [53]. This work highlights the significance of histone demethylase-mediated H3K9me3 removal for SCNT embryos to exhibit a high developmental potential. Likewise, another study also observed the failure in activating H3K9me3-enriched genes after SCNT, and another H3K9me3 demethylase gene, Kdm4d, was identified as a potential target for improving the cloning efficiency [54]. These studies indicate that H3K9me3 is a critical epigenetic barrier for SCNT-mediated reprogramming. 
Importantly, the injection of Kdm4d mRNA would also improve blastocyst development and pregnancy rate of transplanted SCNT embryos in surrogate monkeys [55], suggesting a conserved epigenetic regulation for animal cloning.
Figure 2.
Abnormal reprogramming of epigenetic marks in mouse embryos derived from somatic cell nuclear transfer (SCNT). SCNT triggers the reprogramming of differentiated cells to obtain a totipotent state, but the SCNT-mediated reprogramming in epigenome is frequently inefficient, leading to an impaired developmental potential of SCNT embryos. Genome-wide DNA demethylation occurs after SCNT, but many somatic cell-inherited methylation sites resist the erasure, and SCNT embryos exhibit a higher DNA methylation level compared with normal embryos throughout the preimplantation stages until the blastocyst stage. An aberrant DNA re-methylation event occurs from the two-cell (2C) to four-cell stage during SCNT-mediated development, and the re-methylated regions might associate with mis-expression of transcripts important for ZGA. H3K9me3 is also a critical epigenetic barrier for the reprogramming of SCNT embryos, and the supplement of the H3K9me3 demethylase KDM4B could improve the cloning efficiency. H3K9ac modifications inherited from somatic cells are globally and quickly eliminated following the SCNT and restored to a lower level compared with normal embryos. The H3K9ac reduction on promoters of 2C-specific genes might link to the incomplete ZGA process in SCNT embryos. SCNT also triggers an insufficient reprogramming in the higher order chromatin structure, especially in H3K9me3-marked TADs inherited from somatic cells. Differential A/B compartmentation (colorful lines) and stronger TADs (thicker lines in pink) are restored in SCNT embryos compared with normal embryos, and the super enhancer–promoter interaction (SE-P; small triangle in blue) on 2C gene loci rebuilt during normal development is absent in SCNT embryos.

An abnormally high level of DNA methylation was also observed in SCNT embryos, indicating aberrant reprogramming in another aspect [56]. Using the embryo biopsy system combined with the ultra-low-input whole-genome bisulfite sequencing technology, one study revealed that DNA methylation also exhibits unique patterns during the development of SCNT embryos with different fates [57]. The global DNA methylation level of cloned embryos is significantly higher than that of normal embryos from donor cells to the four-cell stage, and an increase in the DNA methylation level was unexpectedly observed at the four-cell stage compared with the 2C stage in cloned embryos [57]. Differentially methylated regions (DMRs) in cloned embryos compared with normal embryos were then identified, and re-methylated DMRs, which exhibit a higher DNA methylation level than at the previous stage during the development of cloned embryos, were subsequently defined [57]. This unusual re-methylation might partially account for the downregulation of totipotency- and development-related genes in arrested cloned embryos. As expected, inhibiting de novo DNA methylation through KD of the DNA cytosine methyltransferases would reduce the re-methylated regions and restore the expression level of corresponding genes in cloned embryos, and the developmental capacity of these treated embryos was also clearly improved in terms of blastocyst formation and full-term development [57].
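The two-step definition described above (a region that differs between cloned and normal embryos, and that also gains methylation relative to the previous stage in cloned embryos) can be sketched as a simple filter. The region names, methylation fractions, and the 0.2 difference threshold below are hypothetical and are not taken from [57], whose exact criteria may differ.

```python
# Assumed minimum methylation difference (fraction of methylated CpGs).
# Illustrative only; not the threshold used in the cited study.
MIN_DELTA = 0.2

# Hypothetical mean methylation levels per region, embryo type, and stage.
REGIONS = {
    "region_1": {"scnt": {"2C": 0.30, "4C": 0.65}, "normal": {"2C": 0.25, "4C": 0.20}},
    "region_2": {"scnt": {"2C": 0.70, "4C": 0.60}, "normal": {"2C": 0.30, "4C": 0.25}},
    "region_3": {"scnt": {"2C": 0.20, "4C": 0.22}, "normal": {"2C": 0.20, "4C": 0.18}},
}


def is_dmr(scnt_level, normal_level, min_delta=MIN_DELTA):
    """A region counts as a DMR if cloned and normal embryos differ enough."""
    return abs(scnt_level - normal_level) >= min_delta


def remethylated_dmrs(regions, prev_stage, stage, min_delta=MIN_DELTA):
    """Return regions that are DMRs at `stage` and whose methylation in
    cloned embryos rose by at least min_delta since `prev_stage`."""
    hits = []
    for name, levels in regions.items():
        scnt_now = levels["scnt"][stage]
        gained = scnt_now - levels["scnt"][prev_stage] >= min_delta
        if gained and is_dmr(scnt_now, levels["normal"][stage], min_delta):
            hits.append(name)
    return hits


hits = remethylated_dmrs(REGIONS, "2C", "4C")
print(hits)
```

In this toy input, only `region_1` qualifies: `region_2` is a DMR that does not gain methylation between stages, and `region_3` does not differ from normal embryos. A real analysis would also need coverage filters and statistical testing on bisulfite sequencing counts.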
Taken together, the above studies demonstrate the disorders of transcriptome and DNA methylome in detail during SCNT-mediated reprogramming at the preimplantation stages, and provide practical strategies for improving the cloning efficiency.
Abnormal histone acetylation and 3D chromatin structure in SCNT embryos
Insufficient histone acetylation might also limit the development of SCNT embryos, since treatment with a histone deacetylase inhibitor, Trichostatin A (TSA), would significantly improve the cloning efficiency [58, 59]. An earlier study systematically detected the lysine acetylation on core histones during SCNT-mediated preimplantation development in mice using an indirect immunofluorescence method, and showed that the histone acetylation including H3K9ac, H3K14ac, and H4K16ac inherited from somatic cells could be quickly eliminated following the SCNT procedure and restored to a lower level compared with normal embryos after the activation procedure [60]. Indeed, the level of H3K9ac and H3K14ac after TSA treatment following SCNT would be comparable to that in normal embryos, and this improved re-acetylation might contribute to the improved development of TSA-treated SCNT embryos [60]. The establishment of low-input ChIP-seq enabled researchers to compare the histone acetylation dynamics in SCNT and normal embryos in more detail. Strikingly, a significantly lower level of H3K9ac at promoters of 2C-specific genes in SCNT embryos compared with normal embryos was observed throughout the 2C stage, and the H3K9ac reduction might be linked to the failed activation of these genes in SCNT embryos [61]. Among the transcription factors that potentially facilitate gene activation at the 2C stage through regulating H3K9ac occupancy, DUX stood out, as its ectopic expression would greatly help SCNT embryos overcome the 2C barrier and promote full-term development in a dosage-dependent manner. This promotion might be achieved through the H3K9ac restoration at promoters of DUX-targeted and/or 2C-specific genes, and the promotive role of TSA treatment in cloning efficiency might also rely on DUX-mediated regulation [61].
Therefore, sufficient H3K9ac might be an epigenetic mark for successful reprogramming after SCNT, and both the TSA treatment and Dux OE might restore this modification to facilitate the development of SCNT embryos.
In addition to histone modifications, the higher order chromatin structure might also be abnormally regulated in cloned embryos. Metazoan chromosomes show an additional segregation into multi-megabase domains called compartments A/B, which are associated with actively and inactively transcribed chromatin, respectively [62]. Linear chromosomes further fold into arrays of self-interacting domains called topologically associating domains (TADs), which have extensive internal chromatin interactions and rarely contact neighboring regions [63, 64]. The small-scale in situ Hi-C (sisHi-C) technique revealed that although TADs and compartments A/B inherited from somatic donors are rapidly dissolved after the SCNT procedure and progressively reestablished following metaphase exit, weaker distal interactions, differential compartmentation, and stronger TAD structures are present in SCNT embryos at early cleavage stages, indicating that SCNT might not mediate a timely reprogramming of the chromatin architecture [65]. More importantly, a disturbed super enhancer–promoter interaction (SE-P) of the essential 2C-specific gene Zscan4d was observed in SCNT embryos at the early 2C stage. Meanwhile, reprogramming-resistant regions tend to reside within H3K9me3-marked TADs in SCNT embryos, and H3K9me3-marked TADs inherited from somatic cells resist reprogramming [65]. Supplementation of the H3K9me3 demethylase gene Kdm4d before SCNT could significantly reduce H3K9me3-marked TADs and restore the interaction between the super enhancer and promoter at the Zscan4d locus, highlighting the inhibitory role of H3K9me3 in SCNT-mediated reprogramming in terms of higher order chromatin architecture [65]. Another study also revealed stronger TADs in SCNT embryos compared with normal embryos at the one-cell stage and a multistep reprogramming of 3D chromatin architecture after SCNT [66].
Despite the above progress, the development of SCNT embryos is still challenging, since they frequently exhibit defects even after reaching the blastocyst stage. Therefore, the reprogramming barrier of the SCNT process requires further elucidation, and detailed data for other omics layers such as RNA posttranscriptional regulation should also be considered. In addition, novel factors responsible for efficient reprogramming might be identified by studying the remodeling of parental pronuclei during normal development. One study was designed to demonstrate the differential composition of these reprogramming factors in the male and female pronuclei, and revealed that supplementation with appropriate factors might promote the remodeling of the paternal genome [67]. The differential reprogramming capacities of parental factors also indicate that the parental genomes might have distinct reprogramming barriers to overcome, providing essential clues for improving the efficiency of SCNT-mediated reprogramming.
Epigenetic reprogramming during the induction of pluripotent stem cells
As another type of reprogrammed cells, iPSCs are generated from differentiated somatic cells via introduction of four defined transcription factors including Oct3/4, Sox2, c-Myc, and Klf4 [68]. The tetraploid blastocyst complementation assay has proved that iPSCs have the ability to generate full-term mice, suggesting that iPSCs have the full pluripotency to differentiate into all the cell lineages under appropriate conditions [69, 70]. Therefore, iPSCs derived from patients can serve as a promising cell source for disease modeling and the development of regenerative medicines without ethical issues. For example, one study generated iPSCs from fibroblasts of a β-thalassemia patient with a homozygous deletion in the β-globin gene [71]. This iPSC strain was genetically corrected by homologous recombination and forced to be differentiated into hematopoietic progenitors, whose transplantation would recover the normal hematopoiesis in sub-lethally irradiated mice [71]. The iPSC methodology is also applied in epilepsy disease modeling and cell-based therapy [72], highlighting the values of investigations on the molecular regulations involved in the iPSC induction.
Important roles of POU5F1 in promoting the reprogramming during iPSC generation
It has been reported that the four core factors show distinct binding patterns toward their target genes during iPSC induction, and POU5F1 was found to serve as a pioneer factor which could bind the compacted motif regions and trigger the opening of the adjacent chromatin [73]. One study focused on the dynamic changes of genome-wide POU5F1 occupancy and core histone modifications including H3K4me1, H3K4me3, H3K27ac, and H3K27me3 throughout the mouse somatic cell reprogramming process [74]. The global POU5F1 binding would be facilitated by the accessible status of chromatin and negatively regulated by DNA methylation [74]. For promoter regions, the dynamic POU5F1 occupancy is positively correlated with H3K4me3 and negatively correlated with H3K27me3 [74]. Intriguingly, the functional state defined by H3K4me1 and H3K27ac statuses of nearly all the putative enhancers undergoes active transitions during reprogramming, and the H3K4me1 mark possibly presets a context for POU5F1 binding at enhancer regions, followed by the deposition of H3K27ac [74]. Notably, the timing of POU5F1 binding and transcriptional activation in enhancers of individual pluripotency-related genes is synchronized, and POU5F1 occupies the targets at different time points [74]. Taken together, this study demonstrates a well-organized and hierarchical manner of POU5F1 binding during iPSC induction, and this sequential POU5F1 occupation at regulatory elements is in concert with the dynamics of specific epigenetic marks for pluripotent gene activation.
Promotive roles of hydroxymethylation in iPSC reprogramming
Similar to SCNT-mediated reprogramming, the presence of DNA methylation and H3K9me3 inherited from somatic cells is also the epigenetic barrier for faithful reprogramming during iPSC induction [75, 76]. Among TET family proteins responsible for initiating DNA demethylation through 5-methylcytosine (5mC) oxidation [77], only Tet1 shows a progressive upregulation throughout the reprogramming process of mouse fibroblasts [78]. TET1 is recruited by the essential pluripotency inducer NANOG to enhance the expression of a subset of key reprogramming target genes, suggesting a novel role of the 5-hydroxymethylcytosine (5hmC) in the pluripotency establishment [79]. Consistently, KD and OE of Tet1 would abolish and promote the formation of iPSC colonies, respectively, and this promoting role of TET1 is dependent on its catalytic domain [78]. Further investigations demonstrate a simultaneous 5mC-to-5hmC conversion and TET1 binding increase at specific loci of Pou5f1 gene at early stages in OSKM-induced reprogramming, suggesting that TET1 triggers the reactivation of Pou5f1 to promote the reprogramming through active DNA demethylation [78]. Importantly, Tet1 could replace Pou5f1 to induce pluripotency more efficiently in the TSKM system, resulting in fully pluripotent iPSCs that are able to complete full-term development [78]. During the TSKM-mediated reprograming, the somatic-to-pluripotent transition in transcriptome might be modulated by the DNA methylation dynamics, as both the 5mC increase at iPSC-silenced genes and 5hmC conversion at ESC-active genes were observed [78]. Overall, the above studies highlight the indispensable role of TET1-mediated DNA demethylation in the epigenetic remodeling and transcriptional transition for the pluripotency acquisition.
Despite the promising prospect of iPSCs in therapeutic applications, their clinical safety could be a significant concern given the possibly inferior quality of iPSC lines. One study indeed observed a failure of telomere lengthening as well as a severe dysfunction of telomeres and mitochondria in telomerase-deficient iPSCs compared with ESCs derived from SCNT blastocysts with the same background [80]. Nevertheless, future studies focusing on characteristics and determinants of the epigenetic reorganization during iPSC generation might provide valuable clues for understanding the regulatory mechanisms triggering this reprogramming event.
Key factors regulating the entrance and exit of totipotency
Among the early embryos, the zygote and 2C blastomeres are thought to be totipotent cells with the ability to generate all the embryonic and extraembryonic tissues. The in vivo totipotency establishment is thought to be the result of minor and major ZGA, which is potentially facilitated by the 2C-specific genes Dux and Zscan4 [81–84]. Meanwhile, MERVL retrotransposons, a specific class of LTRs, are also transiently activated at the 2C stage, leading to expression of certain 2C-specific chimeric transcripts [85, 86]. A small subset of cells known as 2C-like cells arises spontaneously in ESC culture and is found to have an expanded potency, and the activation of MERVL-derived promoters in ESCs could serve as a marker for the entrance into a 2C-like transcriptional state [86]. Nevertheless, the roles of 2C-specific factors such as DUX in ZGA initiation and totipotency regulation are still under debate, and recent studies suggest that a multilayer regulatory network might participate in the transition between the 2C-like state and ESC state (see below; Figure 3).
Factors and epigenetic marks triggering the entrance of 2C state
Along with the degradation of maternal factors, successful MZT also involves efficient ZGA, and the major wave of ZGA occurs predominantly at the 2C stage with a dramatic and transient activation of specific genes and retrotransposons [1, 86]. Dux encodes a well-known transcription factor that binds promoters of many ZGA-associated genes and activates their transcription specifically at the 2C stage [82, 83]. However, it was found that homozygous Dux-KO mice could survive to adulthood, although they were recovered at a slightly lower frequency than the expected Mendelian ratio [87]. In fact, depletion of Dux causes a downregulation of ZGA-associated genes and the 2C-specific retrotransposon, MERVL, at the early 2C stage, but the activation of these transcripts could still take place at the late 2C stage in the absence of Dux, indicating that Dux facilitates the timely occurrence of the ZGA process but can be substituted by other factors in ZGA initiation [87]. Another study reached similar conclusions on the minor role of Dux in ZGA and early development [88]. Intriguingly, Dux OE at the late 2C stage could be detrimental to embryos, as most of the embryos injected with Dux-EGFP mRNA arrested at the four-cell stage [87]. Therefore, the transiently expressed Dux gene should be rapidly silenced after the 2C stage to ensure the subsequent development.
Figure 3.
Regulatory mechanisms involved in the activation and repression of the 2C-like state in mouse embryonic stem cells (ESCs). Pluripotent ESCs derived from mouse blastocysts exhibit characteristics and a developmental potential similar to those of the inner cell mass of blastocysts. The ESC culture maintains a small population of 2C-like cells with activation of the 2C-specific transcription program, and many factors have been shown to regulate the entrance and exit of the 2C-like state in ESCs. DCAF11 targets KAP1 for ubiquitination-mediated degradation, leading to a lower level of H3K9me3 on the downstream enhancer of the 2C-specific gene Zscan4 and thus the activation of the 2C-like state. The m6A methyltransferase METTL3 catalyzes the methylation of various chromosome-associated RNAs (caRNAs) including transcripts of long interspersed nuclear element 1 (LINE1), and m6A-marked LINE1 RNAs are recognized by the m6A reader YTHDC1, which interacts with ZCCHC8 to mediate the degradation of LINE1. YTHDC1 also recognizes another kind of m6A site on LINE1 transcripts, and YTHDC1 is required for H3K9me3 establishment on 2C-specific retrotransposons mediated by the LINE1-NCL-KAP1 partnership and thus prevents the transition into the 2C-like state. miR-344 is activated by DUX and posttranscriptionally represses ZMYM2 and its partner LSD1. Since ZMYM2 recruits the LSD1/HDAC corepressor complex to silence MERVL, a class III endogenous retrovirus that serves as a marker of the 2C-specific program, the activation of miR-344 leads to the induction of the 2C-like state.

Histone modifications have also been shown to be associated with 2C activation. De novo H3K4me3 deposition is closely related to the upregulation of 2C-specific genes, and broad H3K4me3 domains are extensively lost in 2C-like cells [89]. Moreover, DUX might facilitate the activation of 2C-specific genes through restoring H3K9ac on promoters as mentioned above [61].
Mechanisms ensuring the maintenance of 2C state
The transcription activities of endogenous MERVL elements and the Zscan4 gene are commonly measured to monitor the entrance and exit of the 2C-like state in mouse ESCs [86, 90]. A recent study constructed a mouse ESC line co-expressing MERVL-tdTomato and pZscan4c-EGFP fluorescent reporters (double reporters, DR) and performed genome-wide profiling for chromatin accessibility using ATAC-seq [91]. Among the loci most strongly enriched for the ATAC signal in DR+/+ cells, the noncoding miR-344 cluster genes were noticed, and the activation of miR-344 in ESCs could significantly upregulate 2C-specific transcripts and thus promote the generation of 2C-like cells with an expanded developmental potency [91]. Further investigations reveal that miRNAs of miR-344 genes directly target Zmym2 for the posttranscriptional repression, and Zmym2 represses MERVL expression and thus inhibits the 2C-like transition through recruiting HDAC-containing complexes to the MERVL loci [91]. This Zmym2-mediated repression of the 2C-like state facilitates the totipotency-to-pluripotency transition during mouse preimplantation development to ensure a high developmental potential of early embryos [91]. Moreover, this study identified DUX as an upstream activator of miR-344 genes in 2C-like cells.
Telomere length is primarily maintained by telomerase in most types of somatic cells, but during mouse preimplantation development, the telomerase activity remains relatively low and alternative lengthening of telomeres (ALT) plays a major role in the maintenance of telomere length [92]. A previous study on telomerase-deficient ESCs derived from SCNT embryos has implied that the ALT mechanism could be active in ESCs [80], and it was also reported that the Zscan4 gene cluster serves as a critical regulator in the ALT of ESCs [93]. Through an RNAi screen in the ESC line containing a Zscan4 promoter-driven EGFP reporter, a recent study found that Dcaf11 might promote Zscan4 activation and thus facilitate ALT [94]. Indeed, Dcaf11 is required for the telomere elongation in mouse embryos especially at the 2C and four-cell stages, and Dcaf11 contributes to the transcriptional activation of 2C-specific genes; Dcaf11 also participates in the telomere maintenance of ESCs in a telomerase-independent manner during long-term culture [94]. Interestingly, DCAF11 interacts with KAP1 and reduces the protein level of KAP1 possibly through ubiquitination-associated degradation, and this DCAF11-mediated degradation of KAP1 is required for Zscan4 activation [94]. Further investigations revealed that KAP1 targets a distal enhancer of the Zscan4 gene for deposition of the repressive H3K9me3 marks, and the direct binding of DCAF11 in this region facilitates KAP1 removal and thus activates Zscan4 [94]. Collectively, the above studies clarify previously unclear molecular pathways that maintain the 2C transcriptional activation.
Epigenetic cross-talk implicated in the exit of 2C state
The proteomic study suggests that the maternal factor Zcchc8 might be a core factor regulating mouse preimplantation development [8]. Indeed, homozygous Zcchc8–/– mice showed a much lower survival rate compared with WT and heterozygous mice at late postimplantation stages and after birth, and Zcchc8 is required for the self-renewal and differentiation potency of ESCs [9]. Intrinsically, ZCCHC8 is a core component of the nuclear exosome targeting (NEXT) complex, an RNA exosome cofactor required for the degradation of many nuclear RNA substrates [95, 96]. Further analyses revealed that ZCCHC8 prefers to bind transcripts of the retrotransposon long interspersed nuclear element 1 (LINE1), and Zcchc8 depletion leads to a significantly increased abundance and a prolonged lifetime of LINE1 RNAs in the nucleus of ESCs [9]. This ZCCHC8-mediated LINE1 RNA decay also exists in preimplantation embryos and contributes to the balance of chromatin accessibility which is required for proper development [9]. Consistently, a previous study has shown that prolonged activation of LINE1 prevents the gradual chromatin compaction that occurs naturally after the 2C stage during development, indicating a regulatory role of retrotransposon-derived RNAs in shaping the chromatin state [97].
Certain modifications on nuclear RNAs such as N6-methyladenosine (m6A) are also implicated in the regulation of RNA fates. It has been reported that in mouse ESCs, the methyltransferase METTL3 catalyzes the m6A methylation on various kinds of chromosome-associated RNAs (caRNAs) including promoter-associated RNA (paRNA), enhancer RNA (eRNA), and retrotransposon-transcribed RNA, and these m6A marks are recognized by the nuclear m6A reader YTHDC1, which further recruits the NEXT complex for RNA degradation [98]. Importantly, the m6A-dependent, YTHDC1-mediated RNA decay regulates the abundance of caRNAs and impacts the chromatin openness, which is closely related to the transcription activity [98]. The promotive role of m6A in reducing the half-life of retrotransposon-derived RNAs was also observed in another study [99]. Intriguingly, in addition to the cis-acting influences, m6A is also involved in the trans-acting regulatory roles of caRNAs. In mouse ESCs, LINE1 RNAs serve as a chromatin scaffold to recruit the interacting Nucleolin-KAP1 proteins, and this LINE1-Nucleolin-KAP1 partnership plays an essential role in the repression of the 2C-specific transcription program [100]. A recent study unexpectedly found that YTHDC1 could recognize specific kinds of m6A sites on nuclear LINE1 RNAs and promote the scaffold function of LINE1 through interacting with Nucleolin-KAP1 proteins in mouse ESCs [101]. YTHDC1 thus facilitates the KAP1-mediated H3K9me3 deposition on 2C-specific retrotransposons for transcription repression, and ensures the exit of the 2C-like state in ESCs to maintain a high level of self-renewal [101]. YTHDC1 is also shown to interact with the H3K9me3 methyltransferase SETDB1 and repress 2C-specific transcripts to guard the ESC identity [102].
The cross-talk between transcription outputs and chromatin state indicates a complicated network in the totipotency regulation which remains to be elucidated. Besides, the potential roles of RNA methylation during in vivo totipotency establishment and cell fate transition also require further investigation.
Conclusions and perspectives
Owing to the technological advances in low-input high-throughput sequencing, investigations on the molecular dynamics of early embryos and cloned embryos have made great progress. These studies allow us to better understand the dynamic landscape of the transcriptome as well as multiple epigenetic marks including DNA methylation, histone modifications, and higher order chromatin structures at the early developmental stages. In the meantime, the multi-omics analyses provide important clues to unexpected forms of cross-talk between the transcriptome and epigenome, and these studies also highlight the roles of core factors involved in the regulatory network.
Surprisingly, specific factors not only regulate the fates of targets, but also mediate cross-layer regulatory mechanisms that have complicated and far-reaching impacts on the transcription activity. As shown in studies of the 2C state induction, the activation of 2C-specific genes requires removal of the repressive histone modifications, and transcription factors such as DUX and ZSCAN4, which could induce the 2C activation through regulating corresponding histone marks, are also implicated in miRNA-mediated regulation and/or telomere elongation [91, 94]. Meanwhile, the timely exit of the 2C state ensures the full developmental potential as well as the maintenance of ESCs, and certain RNA modifications along with the related protein factors could contribute to the silencing of 2C-specific transcripts through facilitating the establishment of a repressive chromatin state, which is achieved by trans-acting roles of the regulatory RNAs [101]. These complex regulatory networks suggest that a wider perspective is required for future multi-omics research, and the cross-talk between the chromatin state and transcription outputs should not be ignored.
A comprehensive understanding of the developmental and reprogramming processes as well as the epigenetic regulation mediated by core factors is particularly valuable for practical applications. With the elucidation of molecular dynamics and implicated mechanisms, we would be able to discover novel targets of the regulatory pathways, and clinical applications such as directed differentiation of pluripotent stem cells and self-organization of blastoids might also be further promoted.
Acknowledgments
SG would like to thank all the members of his research team who have contributed to the experiments and data analyses involved in all studies.
Data availability
The data underlying this article will be shared on reasonable request to the corresponding author.