Manipulate Connections in ‘R’ Language

Like most other languages, R provides interfaces for reading data, and they are called connection interfaces. A connection can be made to several kinds of sources. Let’s look at a few of them.

  • file – make a connection to a file
  • url – make a connection to a URL
  • gzfile, bzfile – make a connection to a compressed file (gzip or bzip2)
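As a rough sketch of what a connection is, the snippet below creates one with file() and checks its state before and after opening it. The scratch file and the variable names are assumptions made only so the example can run on its own.

```r
# Hypothetical scratch file so the sketch is self-contained
tmp <- tempfile(fileext = ".txt")
writeLines(c("name", "Renien"), tmp)

con <- file(tmp)                              # create the connection; no mode given, so it is not opened yet
is_connection <- inherits(con, "connection")  # every connection carries the "connection" class
open_before <- isOpen(con)                    # FALSE at this point
open(con, "r")                                # open it in read-only mode
open_after <- isOpen(con)                     # TRUE now
close(con)                                    # release the connection
unlink(tmp)                                   # remove the scratch file
```

Creating a connection and opening it are separate steps, which is why the examples in this post call open() or pass a mode such as "r".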

File Connection

R functions often take a lot of arguments, but most function and argument names are straightforward. If you want a clear understanding of the arguments, check the manual pages (for example, ?file), where they are clearly documented.

In my ‘Read/Write Data into ‘R’ Language’ post I talked about some essential functions without using any file connection code. That is because many functions create and manage the connection internally, so we rarely need to deal with connections directly. When we simply read or write a file, we don’t need to think much about it.

> data <- read.csv("test.txt")

Let’s look at the code below to understand how to create a connection to the file ourselves.

> con <- file("test.txt") ## Create a connection to 'test.txt'
> open(con, "r") ## Open the connection in read-only mode
> data <- read.csv(con) ## Read from the connection
> close(con) ## Close the connection
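One reason to manage the connection yourself is that an open connection remembers its position, so several reads can share it. The sketch below is only an illustration: the scratch file, its contents, and the column names are made up for the example.

```r
# Hypothetical scratch CSV so the sketch is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,Renien", "2,Joseph"), tmp)

con <- file(tmp, "r")                 # passing a mode opens the connection immediately
header <- readLines(con, n = 1)       # consumes only the first line
body <- read.csv(con, header = FALSE, # continues from the second line
                 col.names = c("id", "name"))
close(con)                            # close the connection when done
unlink(tmp)                           # remove the scratch file
```

Here header holds the raw first line, while body holds the remaining rows as a data frame.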

URL Connection

The readLines() function is useful once you have made a connection to the data location. After creating a URL connection, we can use readLines() to read a webpage line by line.

> con <- url("http://renien.com/blog/hello-r-world/", "r") ## open the URL connection
> lines <- readLines(con) ## read the webpage line by line
> close(con) ## close the connection
> head(lines) ## print the first few lines
[1] "<!doctype html>"                                                                         
[2] "<!--[if lt IE 7]><html class=\"no-js lt-ie9 lt-ie8 lt-ie7\" lang=\"en\"> <![endif]-->"   
[3] "<!--[if (IE 7)&!(IEMobile)]><html class=\"no-js lt-ie9 lt-ie8\" lang=\"en\"><![endif]-->"
[4] "<!--[if (IE 8)&!(IEMobile)]><html class=\"no-js lt-ie9\" lang=\"en\"><![endif]-->"       
[5] "<!--[if gt IE 8]><!--> <html class=\"no-js\" lang=\"en\"><!--<![endif]-->"               
[6] "<head>"  
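The same steps can be wrapped in a small helper so that the connection is always closed, even if reading fails. This is a minimal sketch: read_head() is a hypothetical name, and the demo uses a file:// URL that points at a scratch file so it runs without network access.

```r
# Hypothetical helper: read the first n lines from a URL
read_head <- function(address, n = 6) {
  con <- url(address, "r")   # open the URL connection for reading
  on.exit(close(con))        # close it even if readLines() fails
  readLines(con, n = n)
}

# Offline demo with a file:// URL pointing at a hypothetical scratch file
tmp <- tempfile(fileext = ".html")
writeLines(c("<!doctype html>", "<head>", "</head>"), tmp)
path <- normalizePath(tmp, winslash = "/")
address <- paste0(if (startsWith(path, "/")) "file://" else "file:///", path)

first_lines <- read_head(address, n = 2)
unlink(tmp)                  # remove the scratch file
```

With a real page you would pass its address instead, for example read_head("http://renien.com/blog/hello-r-world/").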

Connection to Compressed Files

To open a .gz file we can use the gzfile() interface.

> con <- gzfile("test.gz") ## open connection to compressed file

Once we have opened the compressed file through the connection interface, we can use the readLines() function to read the content of the text file line by line.

> lines <- readLines(con, 2) ## read the first two lines
> lines
[1] "Renien" "Joseph"
> close(con) ## close the connection
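To try this without an existing archive, the sketch below writes a small gzip file and reads it back. The temporary path stands in for test.gz, and the third line of text is made up for the example.

```r
gz_path <- tempfile(fileext = ".gz")  # hypothetical stand-in for test.gz

out <- gzfile(gz_path, "w")           # open for writing; output is gzip-compressed
writeLines(c("Renien", "Joseph", "more text"), out)
close(out)                            # closing flushes the compressed data to disk

con <- gzfile(gz_path, "r")           # open for reading; decompression is transparent
first_two <- readLines(con, n = 2)    # read only the first two lines
close(con)
unlink(gz_path)                       # remove the scratch file
```

The bzfile() interface is used the same way for bzip2-compressed files.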

Blog Series