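Box plots | violin plots: how do you choose a suitable chart to present your data? The two small examples below walk through it step by step.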
# Install any missing packages, then attach them
package.list <- c("tidyverse", "ggsci", "ggsignif")
for (package in package.list) {
  if (!require(package, character.only = TRUE, quietly = TRUE)) {
    install.packages(package)
    library(package, character.only = TRUE)
  }
}
Prepare the dataset. Here we use the built-in ToothGrowth dataset.
ToothGrowth %>% as_tibble()
# A tibble: 60 x 3
     len supp   dose
   <dbl> <fct> <dbl>
 1   4.2 VC      0.5
 2  11.5 VC      0.5
 3   7.3 VC      0.5
 4   5.8 VC      0.5
 5   6.4 VC      0.5
 6  10   VC      0.5
 7  11.2 VC      0.5
 8  11.2 VC      0.5
 9   5.2 VC      0.5
10   7   VC      0.5
# ... with 50 more rows
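Before plotting, it can help to confirm how the observations are spread across the groups. The following is a minimal sketch in base R only; the variable name `cell_counts` is ours and does not appear in the original post.

```r
# ToothGrowth ships with base R (package "datasets")
data("ToothGrowth")

# Cross-tabulate supplement type against dose to see the group sizes
cell_counts <- table(supp = ToothGrowth$supp, dose = ToothGrowth$dose)
print(cell_counts)
```

Each supp-by-dose cell holds the same number of observations, so every box or violin drawn later summarizes an equally sized group.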
Note: in RStudio, Ctrl+Shift+M quickly inserts %>%.
Let's start with a simple box plot.
# Convert dose to a factor so that each dose level gets its own box
ToothGrowth %>% mutate(dose = as.factor(dose)) %>%
  ggplot(aes(dose, len, fill = supp)) +
  geom_boxplot()
A box plot usually reads better with error bars (caps on the whisker ends), which can be added with stat_boxplot.
ToothGrowth %>% mutate(dose = as.factor(dose)) %>%
  ggplot(aes(dose, len, fill = supp)) +
  geom_boxplot(position = position_dodge(0.7),
               width = 0.5, show.legend = TRUE, alpha = 0.8) +
  # Use the same dodge width so the error bars line up with the boxes
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.7),
               width = 0.1, alpha = 0.8)
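To make clear what the box and its error bars encode, here is a rough sketch that recomputes the five values by hand. It assumes the usual rule (quartiles from `quantile()`, whiskers reaching the most extreme points within 1.5 times the IQR of the box); `box_summary` is a hypothetical helper written for this illustration, not a ggplot2 function.

```r
data("ToothGrowth")

# Hypothetical helper: the five values a box plot draws for one group,
# assuming whiskers stop at the most extreme points within coef * IQR
box_summary <- function(y, coef = 1.5) {
  q <- quantile(y, c(0.25, 0.5, 0.75), names = FALSE)
  iqr <- q[3] - q[1]
  inside <- y[y >= q[1] - coef * iqr & y <= q[3] + coef * iqr]
  c(ymin = min(inside), lower = q[1], middle = q[2],
    upper = q[3], ymax = max(inside))
}

# One group from the plot above: supplement VC at dose 0.5
vc_low <- subset(ToothGrowth, supp == "VC" & dose == 0.5)$len
print(box_summary(vc_low))
```

The `ymin` and `ymax` entries are where the error bar caps sit; points outside them would be drawn as individual outliers.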
You can also facet the plot by dose or by supp; here we facet by supp.
ToothGrowth %>% mutate(dose = as.factor(dose)) %>%
  ggplot(aes(dose, len, fill = supp)) +
  geom_boxplot(position = position_dodge(0.7),
               width = 0.5, show.legend = TRUE, alpha = 0.8) +
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.7),
               width = 0.1, alpha = 0.8) +
  # One panel per supplement type, each with its own axis ranges
  facet_wrap(. ~ supp, scales = "free") +
  # Journal of Clinical Oncology palette from ggsci
  scale_fill_jco()
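Faceting by dose instead works the same way. The sketch below is one possible variant rather than code from the original post: it puts supp on the x axis and draws one panel per dose, using only ggplot2. The object name `p_by_dose` is ours.

```r
library(ggplot2)

data("ToothGrowth")

# One panel per dose level, with the two supplements side by side in each
p_by_dose <- ggplot(transform(ToothGrowth, dose = as.factor(dose)),
                    aes(supp, len, fill = supp)) +
  stat_boxplot(geom = "errorbar", width = 0.1) +
  geom_boxplot(width = 0.5, alpha = 0.8) +
  facet_wrap(~ dose)

print(p_by_dose)
```

Which variable goes on the facets depends on the comparison you want readers to make: boxes inside one panel are the easiest to compare with each other.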
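What remains is adjusting the legend and the theme; see the earlier post "ggplot2修飾圖例的那些事" (on polishing ggplot2 legends).
Now let's present the same data again as a violin plot.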
ToothGrowth %>% mutate(dose = as.factor(dose)) %>%
  ggplot(aes(dose, len, fill = supp)) +
  # trim = FALSE lets the violin tails extend beyond the observed range
  geom_violin(position = position_dodge(0.7), trim = FALSE, alpha = 0.8) +
  # Narrow white box plot drawn inside each violin
  geom_boxplot(position = position_dodge(0.7),
               width = 0.15, show.legend = FALSE, alpha = 0.8, color = "white") +
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.7),
               width = 0.1, alpha = 0.8, color = "white") +
  facet_wrap(. ~ supp, scales = "free") +
  scale_fill_jco() +
  theme_bw()
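The outline of a violin is a kernel density estimate of the values in that group, mirrored around the group's x position. As a rough sketch of the idea (base R's `density()` with its defaults, which may not match ggplot2's internal settings exactly), the curve for a single group can be computed like this; `vc_high` and `dens` are names chosen for the illustration.

```r
data("ToothGrowth")

# One group: supplement VC at dose 2
vc_high <- subset(ToothGrowth, supp == "VC" & dose == 2)$len

# Gaussian kernel density estimate of tooth length within the group
dens <- density(vc_high)

# The estimated curve extends past the observed minimum and maximum,
# which is the kind of tail that trim = FALSE keeps in the violin
print(range(vc_high))
print(range(dens$x))
```

This is also why an untrimmed violin can appear to cover values that were never observed: the tails come from the smoothing kernel, not from data points.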
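The violin plot is noticeably more attractive, and we can also use ggsignif to add significance annotations for the comparisons between groups.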
ToothGrowth %>% mutate(dose = as.factor(dose)) %>%
  ggplot(aes(dose, len, fill = supp)) +
  geom_violin(position = position_dodge(0.7), trim = FALSE, alpha = 0.8) +
  geom_boxplot(position = position_dodge(0.7),
               width = 0.15, show.legend = FALSE, alpha = 0.8, color = "white") +
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.7),
               width = 0.1, alpha = 0.8, color = "white") +
  # Pairwise Wilcoxon tests between dose levels, shown as significance stars
  geom_signif(comparisons = list(c("0.5", "1"),
                                 c("0.5", "2"),
                                 c("1", "2")),
              map_signif_level = TRUE, vjust = 0.5, color = "black",
              textsize = 5, test = wilcox.test, step_increase = 0.1) +
  facet_wrap(. ~ supp, scales = "free") +
  scale_fill_jco() +
  theme_bw() +
  theme(panel.spacing.x = unit(0.2, "cm"),
        panel.spacing.y = unit(0.1, "cm"),
        axis.title = element_blank(),
        strip.text.x = element_text(size = 9, color = "black"),
        strip.background.x = element_blank(),
        axis.text = element_text(color = "black"),
        axis.ticks.x = element_blank(),
        legend.text = element_text(color = "black", size = 9),
        legend.title = element_blank(),
        legend.spacing.x = unit(0.1, "cm"),
        legend.key = element_blank(),
        legend.key.width = unit(0.5, "cm"),
        legend.key.height = unit(0.5, "cm"),
        # "none" hides the legend (the original had the typo "non")
        legend.position = "none",
        plot.margin = unit(c(0.3, 0.3, 0.3, 0.3), units = "cm"))
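To see roughly what the significance brackets report, the same pairwise Wilcoxon tests can be run by hand in base R for one panel. This is a sketch under stated assumptions: the star thresholds (0.001, 0.01, 0.05, otherwise "NS.") are what we understand ggsignif's default mapping to be, `stars` is a hypothetical helper, and `exact = FALSE` is set here only to skip the exact test, which is not available when the data contain ties.

```r
data("ToothGrowth")

# Restrict to one facet panel: supplement OJ
oj <- subset(ToothGrowth, supp == "OJ")
dose_pairs <- list(c("0.5", "1"), c("0.5", "2"), c("1", "2"))

# Two-sample Wilcoxon rank-sum test for each pair of dose levels
p_values <- sapply(dose_pairs, function(pr) {
  a <- oj$len[oj$dose == as.numeric(pr[1])]
  b <- oj$len[oj$dose == as.numeric(pr[2])]
  wilcox.test(a, b, exact = FALSE)$p.value
})
names(p_values) <- sapply(dose_pairs, paste, collapse = " vs ")

# Hypothetical helper mirroring the assumed star thresholds
stars <- function(p) {
  if (p < 0.001) "***" else if (p < 0.01) "**" else if (p < 0.05) "*" else "NS."
}

print(data.frame(p = p_values, label = sapply(p_values, stars)))
```

Note that these are unadjusted p-values; with three comparisons per panel, a multiple-testing correction such as `p.adjust()` may be worth considering before interpreting the stars.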
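After this series of theme adjustments the figure finally looks reasonably polished. That is still far from the end, though: for data with many groups there are better visualization forms, which we will explore in the next installment.
Follow my official account (公衆号) "R語言數據分析指南" so you don't miss the next update.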