
How to Import Data in R Programming?

Importing data refers to the simple act of bringing the data from an external source to the current program. Here, we will see how to import data into the R programming language.

Import Data in R

Broadly, there are four ways to import data into R.

Using Read Functions

R has several built-in functions to read data from text-based file formats such as CSV and other delimited text. These functions include read.csv(), read.table(), read.delim(), and readLines(). Excel files are not covered by base R; they need a package such as readxl (read_excel()) or xlsx (read.xlsx()). We can use these functions to import data into R from a file stored on our computer.

Example:

# Importing data from a CSV file 

brainalyst <- read.csv("path/to/brainalyst.csv")
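As a self-contained sketch of the base read functions, the snippet below writes a small CSV to a temporary file and reads it back in three ways. The file contents and the variable names (tmp_csv, as_frame, as_table, raw_lines) are made up for illustration.

```r
# Write a small sample CSV to a temporary file (contents are illustrative)
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("id,name,age", "1,John,25", "2,Jane,30"), tmp_csv)

# read.csv() assumes a comma separator and a header row
as_frame <- read.csv(tmp_csv)

# read.table() is the general form; the header and separator must be stated
as_table <- read.table(tmp_csv, header = TRUE, sep = ",")

# readLines() returns the raw lines as a character vector
raw_lines <- readLines(tmp_csv)

str(as_frame)
```

Both read.csv() and read.table() return a data frame here, while readLines() leaves the parsing to us.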

Using External Packages

Several R packages can import data from different sources such as databases, APIs, web pages, and more. For example, the httr package can import data from an API, while the RODBC package can import data from a database.

Example:

# Importing data from an API using the httr package

library(httr)

url <- "https://api.example.com/brainalyst"

response <- GET(url)

# Return the response body as a single character string
brainalyst <- content(response, as = "text")
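The response body above is still plain text. If the API returns JSON, a helper along the following lines could check the HTTP status and parse the body into R objects. This is a minimal sketch: fetch_json is a hypothetical name, the JSON assumption is ours, and both httr and jsonlite are assumed to be installed.

```r
# Hypothetical helper: fetch a URL and parse a JSON body into R objects.
# Assumes the httr and jsonlite packages are installed.
fetch_json <- function(url) {
  response <- httr::GET(url)
  # Turn HTTP error statuses (4xx, 5xx) into R errors
  httr::stop_for_status(response)
  body_text <- httr::content(response, as = "text", encoding = "UTF-8")
  jsonlite::fromJSON(body_text)
}

# Usage (placeholder URL from the example above, not a real endpoint):
# brainalyst <- fetch_json("https://api.example.com/brainalyst")
```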

Using Clipboard

We can copy the data to the clipboard and pass "clipboard" as the file argument of read.table() to import it. This works on Windows; on macOS, pipe("pbpaste") can be used in place of "clipboard".

Example:

# Importing data from the clipboard 

brainalyst <- read.table("clipboard", header=TRUE)
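When no clipboard is available, for example in a script running on a server, the same kind of pasted text can be parsed with the text argument of read.table(). The sample string below is illustrative and assumes tab-separated values, as a spreadsheet would typically copy them.

```r
# Text as it might be pasted from a spreadsheet (tab-separated, illustrative)
pasted <- "id\tname\tage\n1\tJohn\t25\n2\tJane\t30"

# text = makes read.table() parse the string instead of a file or the clipboard
brainalyst <- read.table(text = pasted, header = TRUE, sep = "\t")
print(brainalyst)
```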

Using Manual Data Entry

We can enter data manually into R using the c() function for vectors or the data.frame() function for data frames.

Example:

# Creating a data frame by manually entering data 

brainalyst <- data.frame(id=c(1, 2, 3), name=c("John", "Jane", "Bob"), age=c(25, 30, 45))

Now let us individually check how to import some of the most commonly used files into the R programming language.


Import CSV File in R

We can use the read.csv() function to import a CSV file into R. Let's walk through an example:

  • Ensure the working directory is set to the CSV file’s folder. We can use the setwd() function to set the working directory.
# Set the working directory

setwd("C:/path/to/w/folder")

  • Load the CSV file: Use the read.csv() function to load the CSV file into R. The file name (or its full path) must be passed as the first argument.
# Load the CSV file

brainalyst <- read.csv("brainalyst.csv")

This will create a data frame called brainalyst in R with the data from the CSV file provided.

The read.csv() function also has optional arguments for customizing the import process. For example, the header argument states whether the file's first row contains column names, and the na.strings argument lists the strings that should be treated as missing values.

For example:

# Load the CSV file with a header row and missing values 

brainalyst <- read.csv("brainalyst.csv", header=TRUE, na.strings=c("NA", ""))

This will create a data frame with a header row and treat “NA” and empty values as missing values.
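To see the effect of na.strings, here is a small sketch that builds a CSV with gaps in a temporary file and counts the missing values after import. The sample rows and the name missing_per_column are made up for illustration.

```r
# Sample CSV with an empty name, a literal NA, and an empty age (illustrative)
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("id,name,age", "1,John,25", "2,,NA", "3,Bob,"), tmp_csv)

brainalyst <- read.csv(tmp_csv, header = TRUE, na.strings = c("NA", ""))

# Count missing values per column
missing_per_column <- colSums(is.na(brainalyst))
print(missing_per_column)
```

Counting missing values right after import is a quick way to confirm that na.strings matched the markers actually used in the file.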

If the CSV file is hosted at a URL, we can pass the URL to read.csv() in place of a file path.

For example:

# Load a CSV file from a URL 

brainalyst <- read.csv("https://example.com/brainalyst.csv")

Import XML File in R

We need to use the XML package to import an XML file into R, which provides functions to parse XML documents and further process them. 

Let's see how we can import an XML file in R:

  • Start by installing the XML package using the install.packages() function.
install.packages("XML")
  • Load the XML package using the library() function.
library(XML)
  • Use the xmlParse() function to parse the XML file and convert it into an R object.
myxml <- xmlParse("path/to/myfile.xml")

It will read the XML file from the specified path and store it in the myxml object.

Let us go through various other functions provided by the XML package to extract information from the XML file. 

For example, we can use the xmlToList() function to convert the XML file into a list, and the xpathSApply() function to extract specific elements from the XML file based on their path.

# Convert the XML file into a list 

mylist <- xmlToList(myxml)

# Extract specific elements from the XML file

titles <- xpathSApply(myxml, "//books/title", xmlValue) 

authors <- xpathSApply(myxml, "//books/author", xmlValue)

In this example, the xpathSApply() function extracts all the books’ titles and authors in the XML file.
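The same calls can be tried without a file by parsing XML from a string. The document below is a made-up sample whose structure is assumed to match the XPath expressions above (title and author elements directly inside books elements); the XML package is assumed to be installed.

```r
library(XML)

# Small illustrative document; the structure is an assumption
xml_text <- "<library>
  <books><title>Book A</title><author>Author X</author></books>
  <books><title>Book B</title><author>Author Y</author></books>
</library>"

# asText = TRUE tells xmlParse() the input is XML content, not a file path
myxml <- xmlParse(xml_text, asText = TRUE)

titles  <- xpathSApply(myxml, "//books/title", xmlValue)
authors <- xpathSApply(myxml, "//books/author", xmlValue)

# Combine the extracted vectors into a data frame
books <- data.frame(title = titles, author = authors)
print(books)
```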

Import JSON File in R

We can use the jsonlite package to import a JSON file into R. It provides functions for reading and writing JSON data. Here's an example:

  • Install jsonlite package:
 install.packages("jsonlite") 

  • Load jsonlite package:
library(jsonlite) 

  • Import the JSON file using the fromJSON() function:
# Importing JSON data from a file
brainalyst <- fromJSON("path/to/brainalyst.json")

The fromJSON() function reads a JSON file and returns a data frame or list object in R based on the structure of the JSON data.
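How the JSON structure maps to R objects can be seen with two small inline strings. This is a sketch with made-up data; it assumes jsonlite is installed and uses the function's default simplification settings.

```r
library(jsonlite)

# Illustrative JSON: an array of objects that share the same keys
json_text <- '[
  {"id": 1, "name": "nitish", "age": 22},
  {"id": 2, "name": "nitin",  "age": 29}
]'

# An array of objects with the same keys is simplified to a data frame
people <- fromJSON(json_text)

# A single object becomes a named list
settings <- fromJSON('{"source": "demo", "rows": 2}')

str(people)
```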

If the JSON data is available at a URL, we can read it directly by passing the URL to the fromJSON() function.

For example:

# Importing JSON data from a URL 

brainalyst <- fromJSON("http://example.com/brainalyst.json")

Once the JSON data is imported into R, we can manipulate and analyze the data using R’s data manipulation and analysis functions.

Import SPSS File to R

To import an SPSS file in R, we need to install and load the foreign package. The foreign package provides functions to read and write data files from other statistical packages, including SPSS.

Here are the steps to import an SPSS file in R:

  • Install and load the foreign package:
install.packages("foreign")  # to install
library(foreign)             # to load
  • Use the read.spss() function to import the SPSS file. We need to specify the file path and name, and set the to.data.frame argument to TRUE to return a data frame:
brainalyst <- read.spss("path/to/brainalyst.sav", to.data.frame=TRUE) 
  • If the SPSS file declares user-defined missing values, read.spss() converts them to NA by default when to.data.frame is TRUE. We can set the use.missings argument to FALSE to keep the original codes instead.
brainalyst <- read.spss("path/to/brainalyst.sav", to.data.frame=TRUE, use.missings=FALSE) 
  • If text in the imported data appears garbled, we may need to use the reencode argument to tell read.spss() which encoding to assume for the file, for example "UTF-8".
brainalyst <- read.spss("path/to/brainalyst.sav", to.data.frame=TRUE, reencode="UTF-8") 

After importing the SPSS file, we can analyze and manipulate the data using R’s functions.

Import txt File to R

We can use the read.table() function to import a text file in R. The read.table() function is versatile and can read delimited data with various separators. Here's an example of how to import a text file in R:

Suppose we have a text file named “data.txt” with the following content:

id name age
1 nitish 22
2 nitin 29
3 ramesh 35

We can import this text file in R using the read.table() function as follows:

# Importing a text file in R 

brainalyst <- read.table("data.txt", header=TRUE)

In the above example, we used the read.table() function to import the data from the "data.txt" file. The header=TRUE argument tells R to use the file's first row as the column names. We can omit this argument if the file does not have a header row.

The read.table() function returns a data frame assigned to the brainalyst variable in this example. We can use various data manipulation functions in R to analyze and manipulate this data frame.
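To try this without creating data.txt by hand, the sketch below writes the same four lines to a temporary file and reads them with and without header=TRUE. The variable names tmp_txt and no_header are ours.

```r
# Recreate the sample file contents in a temporary file
tmp_txt <- tempfile(fileext = ".txt")
writeLines(c("id name age", "1 nitish 22", "2 nitin 29", "3 ramesh 35"), tmp_txt)

# First row becomes the column names
brainalyst <- read.table(tmp_txt, header = TRUE)

# Without header = TRUE, the first row is read as data
# and the columns get default names V1, V2, V3
no_header <- read.table(tmp_txt)

str(brainalyst)
names(no_header)
```

Comparing the two results shows why the header argument matters: the second data frame has one extra row and every column is read as text.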

Import Data in R from Excel

To import an Excel file in R programming, we can use the readxl package, which provides a set of functions to read Excel files. Here are the steps to import an Excel file in R:

  • Install the readxl package using the following command:
install.packages("readxl") 
  • Load the readxl package using the library() function:
library(readxl) 
  • Use the read_excel() function to read the Excel file into R. The function takes the file path as the input and returns a data frame. Optional arguments let us specify the sheet name or index (sheet), a cell range (range), the number of rows to skip (skip), and more.
brainalyst <- read_excel("path/to/brainalyst.xlsx", sheet = "Sheet1") 

In the above example, brainalyst is the data frame’s name that will store the imported data, “path/to/brainalyst.xlsx” is the path to the Excel file, and “Sheet1” is the name of the sheet that we want to import. The function will import the first sheet in the file if the name is not specified.
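When the sheet names are not known in advance, they can be listed first. The sketch below uses the sample workbook that ships with readxl, so it needs no file of our own; it assumes readxl is installed, and the variable names are illustrative.

```r
library(readxl)

# readxl_example() returns the path to a sample workbook shipped with readxl
path <- readxl_example("datasets.xlsx")

# List the sheet names before choosing one
sheets <- excel_sheets(path)
print(sheets)

# Read only the first five data rows of the first sheet
first_rows <- read_excel(path, sheet = sheets[1], n_max = 5)
print(first_rows)
```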

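The readxl package reads both the legacy .xls format and the newer .xlsx format, so files saved by older versions of Excel do not need a different package. The xlsx package and its read.xlsx() function are an alternative, although that package depends on Java.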
