# Lesson 1

### Introduction

##### Getting started

Version March 2018

This script provides a concise introduction to basic functionalities of R.

R (and, for that matter, any programming language) is hard to grasp at first, but you will make progress quickly. Moreover, a multitude of free online resources provide guidance and support for learning on your own. A disclaimer: I am a law professor, not a programmer, so other resources will be better at teaching you how to program. But if you are interested in leveraging data science for law in the programming environment of R, keep on reading.

## Setting up R and RStudio

##### Basic calculations in R
```
# At its most basic, R is a calculator. You type in an expression and it returns the answer.

1+1
```
```
## [1] 2
```
```
16/4
```
```
## [1] 4
```
```
6^8.5
```
```
## [1] 4114202
```
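A few other arithmetic tools come up often alongside the operators above. The following is a small sketch using only base R; the numbers are arbitrary illustrations and not part of the original lesson.

```r
# Parentheses control the order of operations.
(2 + 3) * 4   # 20
2 + 3 * 4     # 14

# %/% is integer division and %% gives the remainder.
17 %/% 5      # 3
17 %% 5       # 2

# Built-in functions cover common operations such as square roots.
sqrt(16)      # 4
```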
```
# As discussed above, you can comment on your code in a script using #.
# This way, R will understand that this part of your script is a comment and will only execute your actual code.

# But you can also store information you have created.
# In the upper right-hand corner of RStudio you see all variables you have created in that session.

x <- 1
y <- 2 + 2

# To print the information you have stored, you can either type in the name of the variable or use print().
```
```
y
```
```
## [1] 4
```
```
print(y)
```
```
## [1] 4
```
```
# You can perform operations on the variables you have created.
```
```
x + y
```
```
## [1] 5
```
```
# We call these variables you create OBJECTS. You have almost complete liberty to name your object.
silly_name <- 5 + 3
silly_name
```
```
## [1] 8
```

##### Object classes

What makes R powerful is that you can not only work with numbers but also with other types of data.

More specifically, there are three types of data that we will use:

- numerical data, e.g. 1; 67; 5.56541
- logical data, i.e. TRUE, FALSE
- character data, e.g. "Hello World"

```
# To determine the type of data you simply ask class().
class(silly_name)
```
```
## [1] "numeric"
```
```
another_silly_name <- "Hello World"
class(another_silly_name)
```
```
## [1] "character"
```
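The third type, logical data, can be checked in the same way. Here is a minimal sketch covering all three types; the object names `a_number`, `a_logical` and `a_string` are made up for illustration and do not appear in the original lesson.

```r
# One example value for each of the three data types.
a_number  <- 5.56541
a_logical <- TRUE
a_string  <- "Hello World"

class(a_number)    # "numeric"
class(a_logical)   # "logical"
class(a_string)    # "character"

# Putting a number in quotes turns it into character data.
class("67")        # "character"

# The is.*() functions test for a type and return TRUE or FALSE.
is.numeric(a_number)     # TRUE
is.character(a_number)   # FALSE
```

Note that `TRUE` and `FALSE` are written without quotes; `"TRUE"` in quotes would be character data.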
```
# Commands like class() or print() are FUNCTIONS. You can perform functions on R objects.
# Whenever you do not know what a function does you can ask using "?".
?class()
# Up to now, we have been dealing with single values: one integer or one string. You can aggregate these values into VECTORS.
# To do so you aggregate values with c().
numeric_vector <- c(1, 2, 3, 4, 5)
numeric_vector
```
```
## [1] 1 2 3 4 5
```
```
# A more efficient way to create the same vector would be:
numeric_vector <- c(1:5)
numeric_vector
```
```
## [1] 1 2 3 4 5
```
```
# You can create vectors with character strings, too.
character_vector <- c("Days", "Months", "Year")
character_vector
```
```
## [1] "Days"   "Months" "Year"
```
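Once values are stored in a vector, you can pick out individual elements and compute on the whole vector at once. The sketch below reuses the two vectors from above; the name `doubled` is introduced here only for illustration.

```r
numeric_vector <- c(1:5)
character_vector <- c("Days", "Months", "Year")

# length() returns the number of elements.
length(numeric_vector)       # 5

# Square brackets select elements by position. R starts counting at 1.
numeric_vector[2]            # 2
character_vector[c(1, 3)]    # "Days" "Year"

# Arithmetic is applied to every element of the vector.
doubled <- numeric_vector * 2
doubled                      # 2 4 6 8 10

# Summary functions reduce a vector to a single value.
sum(numeric_vector)          # 15
mean(numeric_vector)         # 3
```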
```
# In turn, vectors can be aggregated into MATRICES and DATAFRAMES.
# They are essentially tables.
# All values in a matrix have to be of the same data type, whereas dataframes can combine different data types.
# We will work mostly with dataframes.
# For instance, we may want to create a dataframe that collects information on US Presidents for a given year.
# We can create two vectors and combine them into a dataframe.
Years <- c(2015, 2016, 2017, 2018)
US_Presidents <- c("Obama", "Obama", "Trump", "Trump")
Pres_data <- data.frame(Years, US_Presidents, stringsAsFactors = FALSE)
```
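To see what the dataframe contains, and how it differs from a matrix, a short sketch along the following lines may help. It rebuilds `Pres_data` so that it runs on its own; `Pres_matrix` is a name made up for this illustration.

```r
Years <- c(2015, 2016, 2017, 2018)
US_Presidents <- c("Obama", "Obama", "Trump", "Trump")
Pres_data <- data.frame(Years, US_Presidents, stringsAsFactors = FALSE)

# str() gives a compact overview: dimensions, column names and column types.
str(Pres_data)

# The $ operator extracts a single column as a vector.
Pres_data$US_Presidents

# Square brackets select rows and columns: [rows, columns].
Pres_data[Pres_data$Years == 2017, ]

# Each column of a dataframe keeps its own data type.
class(Pres_data$Years)           # "numeric"
class(Pres_data$US_Presidents)   # "character"

# A matrix holds only one data type, so cbind() converts the years to text.
Pres_matrix <- cbind(Years, US_Presidents)
class(Pres_matrix[, "Years"])    # "character"
```

This is one reason dataframes are usually the more convenient choice for mixed data: numbers stay numbers, so you can still calculate with them.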

### Exercises

There are a couple of sample dataframes in R. As an exercise, we will work with some of them. We will also train your “help yourself” instincts. There is ample help on the web. Platforms like

Let’s give it a try.

Example 1)

```
# Load the data set on USA Arrest rates.
data("USArrests")
```

Answer the following questions (with the help of online resources).

1. Sort the data by the Murder rate. Which state has the highest murder rate?
2. What is the average murder rate across all states?
3. What is the correlation between urban population and murder rates?
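
Before tackling questions like these, it usually helps to take a first look at how a dataset is structured. The following sketch uses only base R functions and deliberately stops short of answering the questions.

```r
data("USArrests")

# Show the first six rows to see what the data look like.
head(USArrests)

# List the column names and the number of rows (one row per state).
names(USArrests)
nrow(USArrests)

# str() summarizes the type of every column.
str(USArrests)

# summary() reports minimum, quartiles, mean and maximum for each column.
summary(USArrests)
```

Typing `?USArrests` opens the help page for the dataset, which explains what each column measures.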

Example 2)

```
# Let's look at some data on judges.
data("USJudgeRatings")
```

1. Which judge has the highest overall rating?
2. Which category is the highest rated overall?

Example 3)

```
# Finally, let's take a look at Canada and the Canadian lynx dataset.
data("lynx")
```

1. Plot the number of lynx hunted every year.
2. Try different plot types. Which visualization is most appropriate?
