Fig. 1

N- or C-terminal truncated forms of CGGBP1 affect the global cytosine methylation pattern differently: (A) The full-length (F-len) CGGBP1 corresponds to UniProt ID Q9UFW8, to which a C-terminal spacer and an eGFP tag have been appended. The amino acid positions are numbered and a nuclear localization signal-containing region (NLS) is highlighted in yellow. An N-terminal part (N-term; amino acid positions 1–90) was generated with the same C-terminal NLS, spacer and eGFP tag. A C-terminal part (C-term; amino acid positions 79–167) was generated with the same eGFP tagging, except that the NLS was positioned at the N-terminal end. The eGFP tag was used to establish expression and localization in live as well as fixed cells. The expected molecular weight (in kDa) of the protein encoded by each construct is given to the right of that construct. The locations of the shRNAs used to target only the N-term (shA) or only the C-term (shB) are indicated. (B) Subcellular localization of eGFP-tagged CGGBP1 shows strong nuclear presence of F-len and N-term. The presence of C-term, however, is prominently enhanced in the cytoplasm. A quantification of nuclear (nuc) and cytoplasmic (cyt) signals for each sample is shown in the lowermost panel (N-nuc vs. N-cyt: n = 20, p value < 0.0001; C-nuc vs. C-cyt: n = 20, p value = 0.0637; F-nuc vs. F-cyt: n = 20, p value < 0.0001). A paired t-test between nuc/cyt ratios shows that the ratio is significantly decreased only in C-term. (C) Western blot analysis of nuc and cyt fractions shows the enhanced presence of C-term in the cytoplasmic fraction. The nuc and cyt fractions are marked by the detection of total histone H3 and GAPDH, respectively. The expected band location for each eGFP-tagged form, based on its calculated molecular weight, is indicated by a yellow rectangle in each lane. Ladder bands are labelled with molecular weights in kilodaltons.
(D) PCA plot depicting variance between global cytosine methylation patterns in F-len (red), N-term (green) and C-term (blue) determined by MeDIP-seq. As compared to the Input (pink), the variance in MeDIP has two components: the largest component (PC1 on the X-axis), which accounts for MeDIP enrichment and segregates Input from a tight cluster of F-len, N-term and C-term, and the second largest component (PC2 on the Y-axis), which accounts for differential MeDIP enrichment between the samples. Within a combined 97.9% of the total variance, N-term and F-len show strongly similar cytosine methylation patterns that are distinct from that of C-term. (E) A genome-wide scatter of MeDIP signals shows a stronger correlation between F-len and N-term MeDIP signals than between either of these and C-term MeDIP signals; C-term shows similarly lower correlations with F-len and with N-term. (F) The MeDIP signals are consistently higher in C-term between signal bins 5 and 50. The X-axis bins show the number of MeDIP reads per non-overlapping 0.2 kb genomic segment. Y-axis values are counts of these 0.2 kb genomic segments. The cytosine methylation distribution shows that F-len maintains low cytosine methylation in the signal range 5–50 MeDIP reads per 0.2 kb (blue). Truncation mutations lead to an increase in cytosine methylation in this signal range, with the highest increase observed in C-term. The higher area under the curve for C-term is also reflected in the scatter plots (E), in which C-term values show a prominent concentration of data points below 300 on the C-term axis.