Home on Yun Yan - 严云
https://yany.netlify.com/
Recent content in Home on Yun Yan - 严云
Generator: Hugo -- gohugo.io
Language: en-us
Last updated: Fri, 02 Mar 2018 00:00:00 +0000

PC1 and cell size
https://yany.netlify.com/post/2018/03/02/pc1-and-cell-size/
Fri, 02 Mar 2018 00:00:00 +0000
This post investigates:
What does PC1 imply?
Why and how to remove the effect of cell size?

suppressPackageStartupMessages(library(Seurat))
suppressPackageStartupMessages(library(autosc))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(Matrix))
# ggplot2::theme_set(ggpubr::theme_classic2(base_size = 12))
# umi <- Read10X(data.dir = "~/Downloads/filtered_gene_bc_matrices/hg19/")
# umi <- as.matrix(umi)

Process using autosc
I developed some extension functions on top of the Seurat package and wrapped them into an R package, autosc. autosc has more options for normalization, transformation, HVG selection, and visualization.

How to calculate pairwise distance in a slow and fast way
https://yany.netlify.com/post/2018/01/17/how-to-calculate-pairwise-distance-in-a-slow-and-fast-way/
Wed, 17 Jan 2018 00:00:00 +0000
Question
How do you calculate the pairwise Euclidean distance in R?
dist(mat)
But do you know how slow it is to calculate the pairwise Euclidean distance in R? And how fast it could be in Python’s scikit-learn? Short answer: 300 times faster than base R.
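Part of such a gap can come from replacing the pairwise loop with a single matrix product. The sketch below is one way to do that in base R, using the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y. It is not the post's own code: `pairwise_euclid` is a hypothetical helper name, and the matrix is smaller than the post's 500 by 3000 example so that the check runs quickly.

```r
# Sketch only: Euclidean distances between all rows via one matrix product.
# `pairwise_euclid` is a hypothetical name, not a function from the post.
pairwise_euclid <- function(mat) {
  sq_norms <- rowSums(mat^2)                        # ||x_i||^2 for every row
  cross <- tcrossprod(mat)                          # x_i . x_j for every pair of rows
  d2 <- outer(sq_norms, sq_norms, "+") - 2 * cross  # squared distances
  d2[d2 < 0] <- 0                                   # clamp tiny negatives from rounding
  sqrt(d2)
}

set.seed(1)
n_row <- 50
n_col <- 300
mat <- matrix(rpois(n_row * n_col, lambda = 100), nrow = n_row)

fast <- pairwise_euclid(mat)
slow <- as.matrix(dist(mat))                        # base R reference result

# Largest disagreement between the two results; expected to be
# on the order of floating-point error.
max_abs_diff <- max(abs(fast - slow))
```

Wrapping both calls in `system.time()` on a larger matrix is a simple way to compare the timings on your own machine.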
Basic R: dist()
Rcpp
Matrix: matrix multiplication
velocity.R’s colEuclid function

Data
n_row <- 500
n_col <- 3000
mat <- matrix(rpois(n_row * n_col, lambda = 100), nrow = n_row)
tmat <- t(mat)

Methods
Matrix
pdist <- function(mat){
  # @param mat A non-negative matrix with samples by features
  # @reference http://r.

What do you mean "similarity" (1)
https://yany.netlify.com/post/2018/01/14/hacking-distance-one/
Sun, 14 Jan 2018 00:00:00 +0000
Question
Mathematically speaking, \(\|OA\| = \|OB\|\) holds true in Fig-1, so it is commonly inferred that A and B are the same. But the distance itself only means that A and B have changed by the same numeric amount relative to O; it says nothing about the direction of the change. This is a particular problem when \(x_1\) and \(x_2\) are two genes driving different biological processes, or two factors suggesting distinct customer preferences.

Decode Life by Coding
https://yany.netlify.com/about/
Mon, 01 Jan 0001 00:00:00 +0000
and …
Understanding the properties of data by EDA
Proposing hypotheses by brainstorming
Validating by wet/dry experiments
Seeking opinions from peers by reading papers and attending conferences
Writing papers
Writing code
Getting funded
Hmm, so much fun to be a computational biologist.