Seurat中的数据可视化方法

原文:Data visualization methods in Seurat

原文发布日期:2023-10-31

We’ll demonstrate visualization techniques in Seurat using our previously computed Seurat object from the 2,700 PBMC tutorial. You can download this dataset from SeuratData。官方教程是通过SeuratData包的InstallData函数来下载案例数据,可能需要开启全局代理才能下载:

```{r}
#| eval: false
devtools::install_github('satijalab/seurat-data')
SeuratData::InstallData("pbmc3k")
```
```{r}
#| eval: false
library(SeuratData)
pbmc3k.final <- LoadData("pbmc3k", type = "pbmc3k.final")
pbmc3k.final$groups <- sample(c("group1", 
                                "group2"), 
                              size = ncol(pbmc3k.final), 
                              replace = TRUE)
pbmc3k.final
```

这里我将下载下来的数据保存起来:

```{r}
#| eval: false
saveRDS(pbmc3k.final, file = "data/seurat_official/pbmc3k.final.rds")
```

后面我们直接从本地读取这个Seurat对象:

pbmc3k.final <- readRDS("data/seurat_official/pbmc3k.final.rds")

1 5种可视化marker gene的方法

定义要检查的marker gene:

features <- c("LYZ", "CCL5", "IL32", "PTPRCAP", "FCGR3A", "PF4")

1.1 Ridge plots

Ridge plots - from ggridges. Visualize single cell expression distributions in each cluster

library(Seurat)
RidgePlot(pbmc3k.final, features = features, ncol = 2)

1.2 Violin plot

VlnPlot(pbmc3k.final, features = features)

Violin plots can be split on some variable. Simply add the splitting variable to object metadata and pass it to the split.by argument. 通过添加split.by参数,展示marker gene在不同的样本组别中的表达。

VlnPlot(pbmc3k.final, 
        features = "percent.mt", 
        split.by = "groups")

1.3 Feature plot

Visualize feature expression in low-dimensional space

FeaturePlot(pbmc3k.final, features = features)

1.3.1FeaturePlot的进一步修饰

原始图像:

FeaturePlot(pbmc3k.final, features = "MS4A1")

Adjust the contrast in the plot。通过min.cutoffmax.cutoff调整颜色范围。

FeaturePlot(pbmc3k.final, features = "MS4A1", 
            min.cutoff = 1, max.cutoff = 3)

调整颜色:

FeaturePlot(pbmc3k.final, features = "MS4A1", cols = c("lightblue", "red"))

# 通过RColorBrewer包取色
library(RColorBrewer)
FeaturePlot(pbmc3k.final, features = "MS4A1", cols = brewer.pal(n = 10, name = "RdBu"))

也可以通过ggplot语法修改颜色:

library(ggplot2)
FeaturePlot(pbmc3k.final, features = "MS4A1") +
  scale_colour_gradientn(colours = rev(brewer.pal(n = 10, name = "RdBu")))

Calculate feature-specific contrast levels based on quantiles of non-zero expression. Particularly useful when plotting multiple markers。

FeaturePlot(pbmc3k.final, 
            features = c("MS4A1", "PTPRCAP"), 
            min.cutoff = "q10", 
            max.cutoff = "q90")

多基因FeaturePlot:

通过添加split.by参数,来按照不同的样本组别来分别展示marker gene的表达。

FeaturePlot(pbmc3k.final, 
            features = c("CD3D", "MS4A1", "CD79A"), 
            split.by = "groups")

FeaturePlot()中还提供了blend 参数,来可视化两个基因的共表达情况(添加blend = TRUE)。注意blend = TRUE只能适用于2个基因,多个基因会报错 。

FeaturePlot(pbmc3k.final, 
            features = c("MS4A1", "CD79A"), 
            blend = TRUE)

如果想实现多个基因的话,可以使用scCustomize包中的Plot_Density_Joint_Only()函数绘制多基因联合密度图。该函数还依赖于Nebulosa包,因此还需要先从BiocManager安装该包:

install.packages("scCustomize")
BiocManager::install("Nebulosa")
library(scCustomize)
Plot_Density_Joint_Only(seurat_object = pbmc3k.final, 
                        features = c("CD3D", "MS4A1", "CD79A"),
                        custom_palette = BlueAndRed())

1.4 Dot plots

The size of the dot corresponds to the percentage of cells expressing the feature in each cluster. The color represents the average expression level

DotPlot(pbmc3k.final, 
        features = features) + 
  RotatedAxis()

通过添加split.by参数,来按照不同的样本组别来分别展示marker gene的表达。

DotPlot(pbmc3k.final, 
        features = features, 
        split.by = "groups") + 
  RotatedAxis()

1.5 Heatmap

DoHeatmap(subset(pbmc3k.final, downsample = 100), 
          features = features, 
          size = 3)

DoHeatmap now shows a grouping bar, splitting the heatmap into groups or clusters. This can be changed with the group.by parameter. 默认的group.by为细胞分群信息,即按照细胞的分群作为分组依据来绘制热图:

DoHeatmap(pbmc3k.final, 
          features = VariableFeatures(pbmc3k.final)[1:30], 
          cells = 1:1000, 
          size = 4, # 分组文字的大小
          angle = 45) +  # 分组文字角度
  NoLegend()

我们用meta.data中的任何列作为分群依据。例如这里的”groups”列:

colnames(pbmc3k.final@meta.data)
[1] "orig.ident"         "nCount_RNA"         "nFeature_RNA"      
[4] "seurat_annotations" "percent.mt"         "RNA_snn_res.0.5"   
[7] "seurat_clusters"    "groups"            
DoHeatmap(pbmc3k.final, 
          features = VariableFeatures(pbmc3k.final)[1:30], 
          group.by = "groups",
          cells = 1:1000, 
          size = 4, # 分组文字的大小
          angle = 0) +  # 分组文字角度
  NoLegend()

2 细胞分群图

DimPlot(pbmc3k.final, reduction = "pca")

DimPlot(pbmc3k.final, reduction = "umap")

进一步修饰

library(ggplot2)
DimPlot(pbmc3k.final, reduction = "umap") + 
  labs(title = "Clustering of 2,700 PBMCs") +
  theme_bw()


R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.3

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Asia/Shanghai
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] scCustomize_2.0.1  ggplot2_3.4.4      RColorBrewer_1.1-3 Seurat_5.0.1      
[5] SeuratObject_5.0.1 sp_2.1-2          

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.21            splines_4.3.2              
  [3] later_1.3.2                 bitops_1.0-7               
  [5] tibble_3.2.1                polyclip_1.10-6            
  [7] janitor_2.2.0               fastDummies_1.7.3          
  [9] lifecycle_1.0.4             globals_0.16.2             
 [11] lattice_0.22-5              MASS_7.3-60.0.1            
 [13] magrittr_2.0.3              plotly_4.10.4              
 [15] rmarkdown_2.25              yaml_2.3.8                 
 [17] httpuv_1.6.13               sctransform_0.4.1          
 [19] spam_2.10-0                 spatstat.sparse_3.0-3      
 [21] reticulate_1.34.0           cowplot_1.1.2              
 [23] pbapply_1.7-2               lubridate_1.9.3            
 [25] abind_1.4-5                 zlibbioc_1.48.0            
 [27] Rtsne_0.17                  GenomicRanges_1.54.1       
 [29] purrr_1.0.2                 BiocGenerics_0.48.1        
 [31] RCurl_1.98-1.14             pracma_2.4.4               
 [33] circlize_0.4.16             GenomeInfoDbData_1.2.11    
 [35] IRanges_2.36.0              S4Vectors_0.40.2           
 [37] ggrepel_0.9.5               irlba_2.3.5.1              
 [39] listenv_0.9.0               spatstat.utils_3.0-4       
 [41] goftest_1.2-3               RSpectra_0.16-1            
 [43] spatstat.random_3.2-2       fitdistrplus_1.1-11        
 [45] parallelly_1.36.0           leiden_0.4.3.1             
 [47] codetools_0.2-19            DelayedArray_0.28.0        
 [49] tidyselect_1.2.0            shape_1.4.6                
 [51] farver_2.1.1                matrixStats_1.2.0          
 [53] stats4_4.3.2                spatstat.explore_3.2-5     
 [55] jsonlite_1.8.8              ks_1.14.2                  
 [57] ellipsis_0.3.2              progressr_0.14.0           
 [59] ggridges_0.5.5              survival_3.5-7             
 [61] tools_4.3.2                 ica_1.0-3                  
 [63] Rcpp_1.0.12                 glue_1.7.0                 
 [65] gridExtra_2.3               SparseArray_1.2.3          
 [67] xfun_0.41                   MatrixGenerics_1.14.0      
 [69] GenomeInfoDb_1.38.5         dplyr_1.1.4                
 [71] withr_3.0.0                 fastmap_1.1.1              
 [73] fansi_1.0.6                 digest_0.6.34              
 [75] timechange_0.2.0            R6_2.5.1                   
 [77] mime_0.12                   ggprism_1.0.4              
 [79] colorspace_2.1-0            scattermore_1.2            
 [81] tensor_1.5                  spatstat.data_3.0-4        
 [83] Nebulosa_1.12.0             utf8_1.2.4                 
 [85] tidyr_1.3.0                 generics_0.1.3             
 [87] data.table_1.14.10          httr_1.4.7                 
 [89] htmlwidgets_1.6.4           S4Arrays_1.2.0             
 [91] uwot_0.1.16                 pkgconfig_2.0.3            
 [93] gtable_0.3.4                lmtest_0.9-40              
 [95] SingleCellExperiment_1.24.0 XVector_0.42.0             
 [97] htmltools_0.5.7             dotCall64_1.1-1            
 [99] scales_1.3.0                Biobase_2.62.0             
[101] png_0.1-8                   snakecase_0.11.1           
[103] knitr_1.45                  rstudioapi_0.15.0          
[105] reshape2_1.4.4              nlme_3.1-164               
[107] zoo_1.8-12                  GlobalOptions_0.1.2        
[109] stringr_1.5.1               KernSmooth_2.23-22         
[111] parallel_4.3.2              miniUI_0.1.1.1             
[113] vipor_0.4.7                 ggrastr_1.0.2              
[115] pillar_1.9.0                grid_4.3.2                 
[117] vctrs_0.6.5                 RANN_2.6.1                 
[119] promises_1.2.1              xtable_1.8-4               
[121] cluster_2.1.6               beeswarm_0.4.0             
[123] paletteer_1.5.0             evaluate_0.23              
[125] mvtnorm_1.2-4               cli_3.6.2                  
[127] compiler_4.3.2              rlang_1.1.3                
[129] crayon_1.5.2                future.apply_1.11.1        
[131] labeling_0.4.3              mclust_6.0.1               
[133] rematch2_2.1.2              plyr_1.8.9                 
[135] forcats_1.0.0               ggbeeswarm_0.7.2           
[137] stringi_1.8.3               viridisLite_0.4.2          
[139] deldir_2.0-2                munsell_0.5.0              
[141] lazyeval_0.2.2              spatstat.geom_3.2-7        
[143] Matrix_1.6-5                RcppHNSW_0.5.0             
[145] patchwork_1.2.0             future_1.33.1              
[147] shiny_1.8.0                 SummarizedExperiment_1.32.0
[149] ROCR_1.0-11                 igraph_1.6.0