Helpful Resources for Workshops

Installations

Tutorials for R

  • swirl is an R package that teaches R interactively at the console, including inside RStudio. Highly recommended.

Data science examples and technology landscape

Data at scale

Databases and the relational algebra

MapReduce and Hadoop: relationship to databases, algorithms, extensions, languages; key-value stores and NoSQL; tradeoffs of SQL and NoSQL (readings)

Data cleaning, entity resolution, data integration, information extraction

Machine Learning resources

Visualization and communicating results