History

- **How**:
	- Developed by **Ross Ihaka** and **Robert Gentleman** in the early 1990s at the University of Auckland, New Zealand.
	- **Released** as free software under the GNU General Public License in 1995; the language was inspired by the S programming language.
	- R was specifically created to provide a statistical computing environment with a rich ecosystem for data analysis, visualization, and statistical modeling.
	- The language has become a standard tool in statistics, data science, and bioinformatics, helped by its large collection of packages and an active user community.
	- **Key Milestones**:
		- 1993: R was first announced publicly as an alternative implementation of S; among other differences, it adopted lexical scoping borrowed from Scheme.
		- 2000s: With the rise of big data and data science, R gained popularity as an open-source tool for statistical analysis.
		- 2010s: R solidified its place as a dominant language in the data science ecosystem, thanks to powerful packages like **ggplot2**, **dplyr**, and **tidyr**.
- **Who**:
	- R was created by **Ross Ihaka** and **Robert Gentleman**.
	- Developed and maintained by the **R Core Team**, with support from the **R Foundation for Statistical Computing** and contributors from around the world.
	- The language is supported by a vast community of statisticians, data scientists, and academics.
- **Why**:
	- R was created to provide a more accessible and powerful tool for statistical analysis compared to existing commercial options.
	- Its emphasis on data manipulation, visualization, and statistical computing made it a perfect fit for research and analysis-heavy fields like **biostatistics**, **epidemiology**, and **econometrics**.
	- R’s open-source nature also made it appealing for academic and research purposes, where cost and flexibility were key factors.
  • Introduction

    • Advantages:

      • Statistical Computing: R excels in statistical analysis and data manipulation, with a large collection of packages for specialized statistical methods, making it a favorite in research and academia.
      • Visualization: R has powerful visualization libraries such as ggplot2 for high-quality, customizable graphics, and the shiny framework for building interactive web applications.
      • Rich Ecosystem of Libraries: R boasts an extensive set of libraries for data manipulation (dplyr, tidyr), modeling (caret, randomForest), and statistical analysis (lme4, MASS).
      • Community and Documentation: The R community is large and provides abundant resources such as tutorials and forums, along with the extensive CRAN package repository.
      • Reproducibility: Tools like R Markdown and knitr enable reproducible research by embedding R code within reports and presentations.
      • Integration with Other Tools: R integrates well with other programming languages (e.g., Python, C++), data stores (e.g., SQL databases, Hadoop), and tools (e.g., Excel, Jupyter).
      • Data Handling: R’s data frames and tibbles allow for efficient data manipulation and handling of complex datasets.
    • Disadvantages:

      • Performance: R is generally slower than lower-level languages like C++ or Java, particularly when dealing with very large datasets or computationally intensive operations.
      • Memory Usage: R holds data in memory (RAM) by default, so datasets that approach or exceed available RAM are hard to work with. Packages help but do not remove the limit: data.table reduces copying by updating data by reference, and dplyr can push work to a database through the dbplyr backend.
      • Steep Learning Curve for Newcomers: While R is intuitive for statisticians, its syntax and ecosystem can be difficult for beginners, especially those without a background in statistics or programming.
      • Less Suitable for General-purpose Programming: R is heavily focused on data science and statistics, making it less versatile than languages like Python for general-purpose software development.
      • Limited GUI Support: While RStudio is a popular IDE for R development, R’s support for building applications with graphical user interfaces (GUIs) is not as strong as that of languages like Python or JavaScript.
    • Points to Remember:

      • Statistical Power: R is built to handle complex statistical computations and is one of the most robust tools for data analysis and modeling.
      • Visualization: R offers extensive capabilities for data visualization, with libraries like ggplot2 allowing for highly customizable plots.
      • Rich Ecosystem: R’s ecosystem of packages makes it an ideal tool for tasks ranging from data cleaning and manipulation to advanced machine learning and statistical modeling.
      • Open Source: R is free and open-source, which makes it widely accessible for individuals and organizations working in research and data science.
      • Reproducibility: With R tools like R Markdown, the language emphasizes reproducible research, enabling you to create reports with embedded code and results that can be shared or published.
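
As a rough illustration of the data-handling and statistical-modeling points above, the following sketch uses only base R and the `mtcars` dataset that ships with R. The variable names (`cars`, `kpl`, `powerful`, `kpl_by_cyl`, `wt_slope`) and the 100 hp cutoff are arbitrary choices made for this example, not anything prescribed by the notes.

```r
# Base R only; mtcars is a dataset bundled with R.
# Work on a copy so the built-in dataset is left untouched.
cars <- mtcars

# Vectorized column arithmetic: convert miles per gallon to km per litre
# for every row at once, without an explicit loop.
cars$kpl <- cars$mpg * 0.425144

# Data manipulation: keep cars above 100 hp (illustrative cutoff),
# then average kpl per cylinder count.
powerful <- cars[cars$hp > 100, ]
kpl_by_cyl <- aggregate(kpl ~ cyl, data = powerful, FUN = mean)

# Statistical modeling: linear regression of fuel economy on weight.
fit <- lm(mpg ~ wt, data = cars)
wt_slope <- coef(fit)[["wt"]]

print(kpl_by_cyl)
print(wt_slope)
```

The same steps could be written with dplyr and ggplot2, the packages named in the lists above; base R is used here only so the sketch runs without installing anything.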