Wednesday 10 January 2018

Introduction to R - Part 1 : R Basics




Everyone is talking about R these days, and it seems to be one of the hot topics right now, so I thought I would share some of what I have learnt about it. As I mentioned in one of my previous posts, Application Engine has the ability to call R code for analytics.

So let's first look at the basics of R and then see how we can use it in Infor BI.

For the demonstrations I have downloaded the latest R version for Windows from here

RStudio, the development tool for R, can be downloaded from here.

Installation is pretty simple: just go ahead with the defaults and you are good to go.

Below I have opened RStudio in my local environment.



R: The true basics

What is R? R is a language for statistical computing, developed by Ross Ihaka and Robert Gentleman at the University of Auckland in the 1990s. It is an open source implementation of the S language, which John Chambers and colleagues at Bell Labs began developing in the 1970s. R provides a wide variety of statistical techniques and visualisation capabilities. Another very important feature of R is that it is highly extensible through packages.

Advantages 
  • Open Source and Free
  • Top notch Graphical capabilities
  • Command Line Interface
  • Reproducible R scripts
  • R Packages available 
  • Community help

Disadvantages 
  • Easy to learn, hard to master
  • Command line interface may be daunting at first
  • Poorly written R code may be hard to read and maintain
  • Poorly written code is slow

OK, let's get started.

In the RStudio console, let's try the code below:

1 + 2 (Hit enter)

You will get the result 3.


Variables

You can assign values to variables using <- (a less-than sign followed by a hyphen).

Height <- 5
Width <- 10

You also can do arithmetic operations with the variables.

E.g. Height * Width results in 50.

The variables currently defined can be listed with the ls() function.

You can also assign the result of multiplying two variables to another variable named Area.

Area <- Height * Width

Now if we run the ls() function, it will show Height, Width and Area as the available variables.

What happens when we try to access a non-existent variable? The console will return an error.
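As a minimal sketch of this behaviour (Depth is a hypothetical name that I am assuming has never been assigned), exists() checks for a variable without raising an error, and tryCatch() captures the error message instead of stopping:

```r
# Depth is a hypothetical name that has not been assigned anywhere
exists("Depth")   # FALSE while no such variable is defined

# tryCatch() captures the error so that a script can carry on
msg <- tryCatch(Depth, error = function(e) conditionMessage(e))
msg               # the text of the error the console would have shown
```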


It is good practice to clean up your workspace when you are done with the code.

For this you can use the rm() function.

E.g. removing Height from the workspace:

>  rm(Height)

Now if you list the variables, you will see only Area and Width.

> ls()
[1] "Area"  "Width"

Comments
You use the # sign to enter comments. Everything from the # to the end of the line is ignored when the code is run.
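A short sketch of both comment placements (Side is a hypothetical variable used only for illustration):

```r
# This whole line is a comment and is ignored by R
Side <- 4        # a comment can also follow code on the same line
Side * Side      # 16
```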

Basic data types
The basic data types are also called atomic vector types in R.

Before we move on to the data types, please note the class() function. It is a very useful function for checking what type a variable is.

Logical (TRUE | FALSE | NA)


As you can see, you can also use T for TRUE and F for FALSE. However, it is strongly advised to use the full versions, because T and F are ordinary variables that can be overwritten.
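The logical type can be tried out in the console along these lines. This is only a sketch; is_tall is a hypothetical variable name, and the last part shows why the short forms T and F are risky:

```r
is_tall <- TRUE          # is_tall is a hypothetical variable name
class(is_tall)           # "logical"
class(NA)                # "logical"
identical(T, TRUE)       # TRUE in a fresh session

# T is an ordinary variable, so it can be reassigned by accident
T <- FALSE
t_after <- isTRUE(T)     # FALSE: T no longer means TRUE
rm(T)                    # remove our copy so T refers to TRUE again
```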

Numeric



You will see the difference between the integer 2 and the numeric 2 from the output.
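For instance, a minimal sketch (two_num and two_int are hypothetical names; the L suffix is what marks a number as an integer):

```r
two_num <- 2     # plain numbers are numeric by default
two_int <- 2L    # the L suffix makes an integer
class(two_num)   # "numeric"
class(two_int)   # "integer"
class(2.5)       # "numeric"
```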

You can use the is.*() functions, such as is.numeric() and is.integer(), to check whether a variable is of a certain data type.


As you can see, integers are numeric, but not all numerics are integers.
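The checks can be sketched as follows, using literal values rather than variables:

```r
is.numeric(2)    # TRUE
is.numeric(2L)   # TRUE: an integer is also numeric
is.integer(2)    # FALSE: a plain 2 is not stored as an integer
is.integer(2L)   # TRUE
is.character(2)  # FALSE
```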

Characters
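Text values are written in quotes and have the class character. A small sketch (greeting is a hypothetical example value):

```r
greeting <- "Hello"     # hypothetical example value
class(greeting)         # "character"
class("4")              # "character": quoted digits are text, not numbers
is.character(greeting)  # TRUE
```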



Other atomic types
  • double, the double-precision storage type that R uses for numeric values by default
  • complex for handling complex numbers
  • raw to store raw bytes
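These types can be inspected with typeof() and class(). This is only a sketch, and byte is a hypothetical variable name:

```r
typeof(2)            # "double": how a numeric value is stored
class(3 + 2i)        # "complex"
byte <- as.raw(255)  # a single raw byte
class(byte)          # "raw"
charToRaw("R")       # 52, the byte for the letter R in hexadecimal
```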


Coercion

Coercion means converting one data type to another using the as.*() functions, such as as.numeric(), as.character() and as.integer().


Important
TRUE is converted to the numeric 1 and FALSE to the numeric 0
Numeric 4 is converted to character "4"
Character "4.5" is converted to numeric 4.5

You can even convert this character string, "4.5", to an integer, but this will result in some
information loss, because you cannot keep the decimal part here.

You also need to be aware that not all data conversions are possible.
E.g. converting "Hello" into an integer results in NA and a warning message.
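The conversions described in this section can be sketched as follows. The variable bad is hypothetical, and suppressWarnings() is used here only to keep the example quiet; in the console you would see the warning:

```r
as.numeric(TRUE)      # 1
as.numeric(FALSE)     # 0
as.character(4)       # "4"
as.numeric("4.5")     # 4.5
as.integer("4.5")     # 4: the decimal part is dropped

# "Hello" cannot be read as a number, so the result is NA
bad <- suppressWarnings(as.integer("Hello"))
is.na(bad)            # TRUE
```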

In the next post I will be talking about vectors.
