data.table is a powerful tool for exploring data. But how fast is it? Here we provide a performance test for subsetting data.
Code:
library(data.table)
From the benchmarks above, we can see that filter in dplyr is fast when the data is small (under 10 MB), but data.table's keyed search is faster once the data gets larger. fastmatch, despite its name, is not fast. Ha! Even data.frame is slower than matrix. data.table is well worth learning!
My environment is Ubuntu 14.04 with R 3.1.1, compiled with the Intel C++ and Fortran compilers and linked against MKL. My CPU is a 3770K running at 4.3 GHz.
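
For readers who want to try a comparison of this kind themselves, here is a minimal sketch. It is not the original benchmark code, which is not reproduced in full above. The data size, the column names (id, value), and the lookup value 3 are all assumptions chosen for illustration, and only base R and data.table are used. A dplyr filter or a fastmatch lookup could be timed the same way.

```r
library(data.table)

set.seed(1)
n <- 1e5L  # assumed size; the sizes used in the original test are not shown

# Hypothetical test data: an integer id column and a numeric value column
df <- data.frame(id = sample.int(26L, n, replace = TRUE), value = rnorm(n))
m  <- as.matrix(df)       # numeric matrix with columns "id" and "value"
dt <- as.data.table(df)
setkey(dt, id)            # sort by id and mark it as the key

# Time the same subset (rows where id is 3) with each structure
t_df <- system.time(r_df <- df[df$id == 3L, ])                  # logical scan on a data.frame
t_m  <- system.time(r_m  <- m[m[, "id"] == 3, , drop = FALSE])  # logical scan on a matrix
t_dt <- system.time(r_dt <- dt[.(3L)])                          # keyed lookup on a data.table

# All three approaches should select the same rows
stopifnot(nrow(r_df) == nrow(r_m), nrow(r_df) == nrow(r_dt))

# Elapsed seconds for each approach
c(data.frame = t_df[["elapsed"]],
  matrix     = t_m[["elapsed"]],
  data.table = t_dt[["elapsed"]])
```

A single system.time call is noisy at this size, so the numbers are only a rough guide. Repeating each call many times, and raising n, gives a fairer picture of how the approaches scale.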