Thursday 11 January 2018

Introduction to R - Part 2 : Vectors

Vectors in R
In this post I will be talking about 3 topics on vectors:
  1. Creating and naming vectors
  2. Vector Arithmetic
  3. Vector Sub-setting
What is a vector


A vector is a sequence of data elements of the same basic data type. Remember the atomic vector types we discussed before? 
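
As a quick refresher, here is a small sketch with one vector per basic type. The variable names are made up for this illustration, and I am assuming these are the types covered in the earlier post.

```r
# One small vector per basic type; class() reports the type of each
flags   <- c(TRUE, FALSE, TRUE)     # logical
amounts <- c(1.5, 2, 10)            # numeric (double)
counts  <- c(1L, 2L, 3L)            # integer (note the L suffix)
suits   <- c("spades", "hearts")    # character

class(flags)    # "logical"
class(amounts)  # "numeric"
class(counts)   # "integer"
class(suits)    # "character"
```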

Creating and naming vectors
You can create a vector using the c() function. You can also check whether a variable is a vector by using the is.vector() function (note the lowercase v, R is case sensitive).

Try the below commands

vector_poker <- c(12,13,20,21) # Creating a Vector
is.vector(vector_poker) # Checking if the vector_poker is a Vector
is.vector(Area) # Checking if Area is a Vector
class(vector_poker) # Checking the class of the vector_poker



Let's create a vector with characters.

vector_name <- c("spades" , "hearts" , "diamonds" , "clubs")
So we now have 2 vectors. One with numbers and one with strings.

What if we want to label our numeric vector with the names in the 2nd vector?
You can simply use the names() function to do this.

names(vector_poker ) <- vector_name 



And now our vector_poker has names as well as values.
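
Here is a small self-contained sketch of what that looks like in the console. It just repeats the steps above and prints the result.

```r
vector_poker <- c(12, 13, 20, 21)
vector_name  <- c("spades", "hearts", "diamonds", "clubs")

# Attach the labels to the numeric vector
names(vector_poker) <- vector_name

print(vector_poker)
#   spades   hearts diamonds    clubs
#       12       13       20       21

# names() also reads the labels back
names(vector_poker)
```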

There are different ways to skin a cat. So see the below examples of how to get the same outcome.

vector_option_1 <- c(12,13,20,21) 
vector_name <- c("spades" , "hearts" , "diamonds" , "clubs")
names(vector_option_1) <- vector_name 

vector_option_2 <- c(spades = 12, hearts = 13, diamonds = 20, clubs = 21) 

vector_option_3 <- c("spades" = 12, "hearts" = 13, "diamonds" = 20, "clubs" = 21) 

All 3 options will give you the same output.
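
If you don't want to take my word for it, identical() can confirm this. In this sketch I have called the quoted-name variant vector_option_3 so that the three vectors don't overwrite each other.

```r
# Option 1: create the values first, then attach the names
vector_option_1 <- c(12, 13, 20, 21)
names(vector_option_1) <- c("spades", "hearts", "diamonds", "clubs")

# Option 2: name the elements inside c()
vector_option_2 <- c(spades = 12, hearts = 13, diamonds = 20, clubs = 21)

# Option 3: same as option 2, but with the names in quotes
vector_option_3 <- c("spades" = 12, "hearts" = 13, "diamonds" = 20, "clubs" = 21)

identical(vector_option_1, vector_option_2)  # TRUE
identical(vector_option_2, vector_option_3)  # TRUE
```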

Now we know that R vectors have attributes associated with them. When we set the names to the vector we actually set the names attribute of the vector.

You can use the str() function to display the structure of an R object.
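
A minimal sketch of str() on the named poker vector follows. I have also added attributes(), which is not mentioned above but shows the names attribute directly.

```r
vector_poker <- c(spades = 12, hearts = 13, diamonds = 20, clubs = 21)

# Compact summary: type, length, values and attributes
str(vector_poker)
#  Named num [1:4] 12 13 20 21
#  - attr(*, "names")= chr [1:4] "spades" "hearts" "diamonds" "clubs"

# The names we set are stored as an attribute of the vector
attributes(vector_poker)
```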


Another important thing to remember is the length() function, which returns the number of elements in a vector. Try it on Area, text and vector_poker.


Area and text are vectors of length 1, while the poker vector has a length of 4.
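
Here is a sketch you can run on its own. Area and text come from the previous post, so the values I give them here are placeholders made up for this example.

```r
# Placeholder values, assumed for illustration only
Height <- 10
Width  <- 5
Area   <- Height * Width
text   <- "poker"

vector_poker <- c(12, 13, 20, 21)

length(Area)          # 1, a single number is a vector of length 1
length(text)          # 1, a single string is also a vector of length 1
length(vector_poker)  # 4
```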

Important
The last important thing is that in R, a vector can only have elements of the same type. They are also often called "atomic vectors", to differentiate them from "lists" (a list is a data structure which can hold elements of different data types). This means that you cannot have a vector that contains both logical values and numbers, or both numbers and characters.

If we try to build such a vector, R automatically performs coercion to make sure that you have a vector that contains elements of the same type. Let's see how that works.
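
As a first small sketch, here is what happens when logical values and a number are mixed. The variable name mixed is just for this example.

```r
# Logical values are coerced to numbers: TRUE becomes 1, FALSE becomes 0
mixed <- c(TRUE, FALSE, 5)

mixed         # 1 0 5
class(mixed)  # "numeric"
```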

Coercion of Vectors
vector_ranks <- c(9,4,"B" , 11 , "J" , 4, 34 , "R")

In this example there are numeric as well as character elements.

Output 
> vector_ranks
[1] "9"  "4"  "B"  "11" "J"  "4"  "34" "R" 

Now let's check the class of this vector.

> class(vector_ranks)
[1] "character"

Be careful when you have such instances in your code: R does not warn you about the coercion, and the numbers are now stored as strings, so arithmetic on them will fail. For these kinds of scenarios you should probably use a list. We will discuss later how to use a list.
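
To make the risk concrete, here is a rough sketch. The list version (list_ranks is a name I made up) is only a preview and not a full treatment of lists.

```r
vector_ranks <- c(9, 4, "B", 11, "J", 4, 34, "R")

vector_ranks[1]         # "9" (a string, not the number 9)
# vector_ranks[1] + 1   # would stop with: non-numeric argument to binary operator

# A list keeps each element's own type
list_ranks <- list(9, 4, "B", 11, "J", 4, 34, "R")

class(list_ranks[[1]])  # "numeric"
class(list_ranks[[3]])  # "character"
list_ranks[[1]] + 1     # 10
```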


Vector Arithmetic
We already did some arithmetic calculations with Vectors.

Remember Area <- Height * Width

But let's have a look at arithmetic calculations for vectors with more than 1 element.

My friends like to play poker on weekends so I thought I would give you an example related to poker :D. Now don't judge me alright !!!!

Let's say I want to record my gambling earnings for the past 3 days.

earnings <-  c(100, 500 , 200) # Btw this is in US $ 

My good friend Mark likes to play it risky. So he tells me, I will triple your earnings if you manage to beat me in poker today. Hmmmm, I thought about it for a while and I knew he was playing mind games because he had his poker face on.

But I decided life is too short, might as well take a risk LOL. And you know what? I won.

So now my total earnings can be calculated like this.

earnings * 3


R multiplies each element of the vector by 3, so the earnings for every day are tripled.
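
Here is the same calculation as a snippet you can paste into the console. The name tripled is only there so the result can be reused.

```r
earnings <- c(100, 500, 200)

# The single number 3 is applied to every element of earnings
tripled <- earnings * 3
tripled
# [1]  300 1500  600
```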

In the same way we can use division, subtraction, addition and many more operations. R applies the operation to every element in the vector.
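
A few more examples along the same lines, with numbers picked arbitrarily for illustration:

```r
earnings <- c(100, 500, 200)

earnings / 2           # 50 250 100
earnings + 10          # 110 510 210
earnings - 50          # 50 450 150

# With two vectors of the same length, elements are paired up by position
earnings + c(1, 2, 3)  # 101 502 203
```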


OK I have to be honest. I did also lose money during the past 3 days.

expenses <- c(50,200,300)
So now I'm gonna calculate my profit.
I can simply subtract expenses from my tripled earnings.

> profit <- (earnings*3) - expenses
> profit
[1]  250 1300  300

The last day was terrible: without the tripled earnings I would have ended up losing 100$ 😐

Now I am going to calculate the sum of my profit.

> sum(profit)
[1] 1850

Yaaaay I've earned 1850$, not bad eh.
One last thing on vectors. You can compare 2 vectors using comparison operators such as > and <.

Eg: 
> earnings > expenses
[1]  TRUE  TRUE FALSE

This statement compares the vectors element by element to see if each earning is greater than the corresponding expense in the expenses vector. So now I know that on the last day I spent more than I earned (before the tripling).
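
The result of a comparison is itself a logical vector, so you can keep working with it. A small sketch follows; in_profit is a name I made up, and which() is an extra function not covered above.

```r
earnings <- c(100, 500, 200)
expenses <- c(50, 200, 300)

in_profit <- earnings > expenses  # TRUE TRUE FALSE

# TRUE counts as 1 in sum(), so this is the number of profitable days
sum(in_profit)     # 2

# Positions of the days where expenses were at least as high as earnings
which(!in_profit)  # 3
```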

Vector Sub-setting
Subsetting means selecting parts of an existing vector to create a new vector.
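
The examples in this section use a vector called vector_cards. Its definition is not shown in the post, so here is a sketch that reconstructs it from the values printed in the examples.

```r
# Values taken from the printed output in the examples of this section
vector_cards <- c(spades = 11, hearts = 12, diamonds = 16, clubs = 13)

vector_cards
#   spades   hearts diamonds    clubs
#       11       12       16       13
```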

Subset by Index
You can specify which element you want to output by simply putting its index inside [] (square brackets). Note that indices in R start at 1.

> vector_cards
  spades   hearts diamonds    clubs
      11       12       16       13
> vector_cards[1] # The result is a vector with one element
spades
    11

Subset by Name
Instead of using the index you can use the name to specify the element you need.

> vector_cards["spades"]
spades
    11

Subset multiple elements
Say you want to select spades and clubs. Then you can simply specify the indices you want to select.
> vector_cards[c(1,4)]
spades  clubs
    11     13

Let's assign the result to a new vector called vector_selection.
> vector_selection <- vector_cards[c(1,4)]
> vector_selection
spades  clubs
    11     13

Note : The order of the result depends on the order of the indices
> vector_selection <- vector_cards[c(4,1)]
> vector_selection
 clubs spades
    13     11

You can also use the labels to specify the elements.
> vector_cards[c("clubs","spades")]
 clubs spades
    13     11

Subset all but some
To leave out the first element, put a minus sign in front of its index.

> vector_cards[-1]
  hearts diamonds    clubs
      12       16       13

You can also remove multiple elements.
> vector_cards[-c(1,2)]
diamonds    clubs
      16       13

Note : The - operator does not work with names
> vector_cards[-"spades"]
Error in -"spades" : invalid argument to unary operator
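
If you do want to drop elements by name, one possible workaround (not the only one) is to compare against names() and subset with the resulting logical vector. A sketch, with vector_cards rebuilt from the values printed above:

```r
vector_cards <- c(spades = 11, hearts = 12, diamonds = 16, clubs = 13)

# Keep every element whose name is not "spades"
no_spades <- vector_cards[names(vector_cards) != "spades"]
no_spades
#   hearts diamonds    clubs
#       12       16       13

# Drop several names at once with %in%
no_majors <- vector_cards[!(names(vector_cards) %in% c("spades", "hearts"))]
no_majors
# diamonds    clubs
#       16       13
```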

Subset using logical vector

The elements with a corresponding value of TRUE will be kept and those with FALSE will be removed.

> vector_cards[c(TRUE,TRUE,TRUE,FALSE)]
  spades   hearts diamonds
      11       12       16

So what would happen if you specify fewer logical values than the total number of elements in the vector? Will that throw an error? Hmmmmmm ....

Well, R performs something called recycling.

Let's try using only 2 logical values.

> vector_cards[c(TRUE,FALSE)]
  spades diamonds
      11       16
> vector_cards
  spades   hearts diamonds    clubs
      11       12       16       13

As you can see, R repeats the contents of the logical vector until it has the same length as vector_cards, so c(TRUE,FALSE) becomes c(TRUE,FALSE,TRUE,FALSE).

Let's try with 3 logical values now.

> vector_cards[c(TRUE,FALSE,TRUE)]
  spades diamonds    clubs
      11       16       13

Even if we use a vector of length 3 to do the selection, the vector is recycled to end up with a vector of length 4, which results in appending its first element again.

So behind the scenes the below line of code is executed.

> vector_cards[c(TRUE,FALSE,TRUE,TRUE)]
  spades diamonds    clubs
      11       16       13
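
You can mimic the recycling yourself with rep_len(), which repeats a vector until it reaches a given length. This is only a sketch to illustrate the idea, not what R literally runs internally. It uses the same card values as above, and selector and recycled are names made up for this example.

```r
vector_cards <- c(spades = 11, hearts = 12, diamonds = 16, clubs = 13)

selector <- c(TRUE, FALSE, TRUE)

# Stretch the 3-element selector to the length of vector_cards
recycled <- rep_len(selector, length(vector_cards))
recycled  # TRUE FALSE TRUE TRUE

# Both selections return the same elements
identical(vector_cards[selector], vector_cards[recycled])  # TRUE
```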

Woah that's too much for a post isn't it. I think I'm gonna grab a pizza and chillax for the rest of the day.

Next post I will be talking about Matrices.  Stay tuned folks.....



