Most people using LaTeX feel that creating tables is no fun. A few days ago I stumbled across a neat function written by Paul Johnson that produces plain LaTeX code as well as LaTeX code that can be used within LyX. It formats regression model results so that they look like the output of the Stata outreg command, and his R function has the same name: outreg(). The outreg code can be found on his website or in the PDF copy of the code from his website.
I took the code, put it into a .rnw file and sweaved it. It worked like a charm and produced beautiful results (see the picture on the left and the PDF). Below you can find the code for the noweb file (.rnw). LaTeX code is colored grey, R code is colored blue. Just have a look at all the results as a PDF file. Paul Johnson has also compiled a nice list of R-Tips, which can be found on his website as well.
Continue reading ‘R: Function to create tables in LaTex or Lyx to display regression model results’
Filed under: R / R-Code | 2 Comments
Tags: LaTex, Lyx, regression, sweave, tables
Just a little note for German-speaking R beginners: There is an introductory course in R (in German) available online on the website of the department of methodology and evaluation research at the University of Jena. Dr. Ivailo Partchev teaches a seven-session course on the topic (duration 11.5 hours).
Filed under: News | Leave a Comment
Before you read this post, please have a look at Enrique’s comment below. He pointed out that the built-in R function modifyList() already does what I wanted to describe in this post. Well, I live to learn :)
I was wondering how I could write a function that has default settings but accepts a list that overwrites those defaults via the dot-dot-dot / three-point argument. Here comes my solution.
# building a function with a list of default settings
# that can be modified by an optional list passed
# via the dot-dot-dot / three point argument
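The full solution sits behind the link to the complete post, so what follows is only a minimal sketch of the idea, built on modifyList() as suggested in Enrique's comment. The function name make_settings and the three default settings are made up for illustration.

```r
# Hypothetical example: the defaults live in a list and are overwritten
# by whatever arrives via the dot-dot-dot argument.
make_settings <- function(...) {
  defaults  <- list(col = "black", lwd = 1, lty = "solid")
  overrides <- list(...)
  # Accept either named arguments or one unnamed list of settings
  if (length(overrides) == 1 && is.null(names(overrides)) &&
      is.list(overrides[[1]])) {
    overrides <- overrides[[1]]
  }
  # modifyList() replaces matching elements and appends new ones
  modifyList(defaults, overrides)
}

make_settings()                         # defaults only
make_settings(col = "red")              # overwrite one default
make_settings(list(lwd = 3, pch = 19))  # pass the changes as a list
```

Settings that are not among the defaults (pch in the last call) are simply appended, which may or may not be what you want in a real function.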
Filed under: R / R-Code | 4 Comments
Tags: building functions, dot-dot-dot
On the REvolutions Blog there is a nice posting on the often raised question of how good or reliable R is. At my university R is hardly used, and lecturers have sometimes asked me whether the calculations done by R and its packages are accurate. The linked posting addresses exactly this concern and tries to clarify it.
Filed under: R / R-Code | Leave a Comment
Sometimes I find it useful to merge two data frames like the following ones
  X1 X2 X3 X4      Y1 Y2 Y3 Y4
1  o  o  o  o    1  X  X  X  X
2  o  o  o  o    2  X  X  X  X
3  o  o  o  o    3  X  X  X  X
like a zip fastener, interleaving them either along the columns
  X1 Y1 X2 Y2 X3 Y3 X4 Y4
1  o  X  o  X  o  X  o  X
2  o  X  o  X  o  X  o  X
3  o  X  o  X  o  X  o  X
or along the rows of the data frames.
  V1 V2 V3 V4
1  o  o  o  o
4  X  X  X  X
2  o  o  o  o
5  X  X  X  X
3  o  o  o  o
6  X  X  X  X
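One way to get both results in base R is to bind the two data frames first and then reorder the columns or rows. This is a small sketch, not necessarily the code from the full post; the helper names zip_cols and zip_rows are invented here, and both assume data frames of identical dimensions.

```r
# Two example data frames shaped like the ones shown above
x <- data.frame(matrix("o", nrow = 3, ncol = 4), stringsAsFactors = FALSE)
y <- data.frame(matrix("X", nrow = 3, ncol = 4), stringsAsFactors = FALSE)
names(y) <- paste0("Y", 1:4)

# Interleave the columns: X1 Y1 X2 Y2 ...
zip_cols <- function(a, b) {
  stopifnot(dim(a) == dim(b))
  both <- cbind(a, b)
  # order(c(1:n, 1:n)) yields 1, n+1, 2, n+2, ...
  both[, order(c(seq_len(ncol(a)), seq_len(ncol(b))))]
}

# Interleave the rows: row 1 of a, row 1 of b, row 2 of a, ...
zip_rows <- function(a, b) {
  stopifnot(dim(a) == dim(b))
  names(b) <- names(a)  # rbind() needs matching column names
  both <- rbind(a, b)
  both[order(c(seq_len(nrow(a)), seq_len(nrow(b)))), ]
}

zip_cols(x, y)
zip_rows(x, y)
```

The row version keeps the row names that rbind() assigned, which is why the result is labelled 1, 4, 2, 5, 3, 6 as in the output above.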
Filed under: R / R-Code | 1 Comment
Tags: combine, data frame, merge, zip fastener
Some statistical programs offer the option to attach a footnote to the graphical output they create. This footnote may contain the name of the script or file that produced the graphic, the author's name and the date of creation. In SAS, for example, there is a footnote command for this. Once I realized that this makes life a lot easier, I wrote a simple three-line function in R which I call at the end of the construction of any graphic. I suppose that this is what my professors meant by "good practice". The nice thing about implementing this in the grid graphics system is that you can produce multiple graphics [e.g. by par(mfrow=c(2, 2))] and the footnote will still be positioned correctly.
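A minimal sketch of such a footnote function, assuming the grid package. The name add_footnote, its arguments and the footnote text are placeholders, and the function in the full post may differ.

```r
library(grid)

# Hypothetical helper: writes small grey text into the bottom-right
# corner of the whole device, independent of any par(mfrow) layout.
add_footnote <- function(text = format(Sys.time(), "%d %b %Y"),
                         size = 0.7, colour = grey(0.5)) {
  pushViewport(viewport())  # a viewport spanning the full page
  grid.text(label = text,
            x = unit(1, "npc") - unit(2, "mm"),
            y = unit(2, "mm"),
            just = c("right", "bottom"),
            gp = gpar(cex = size, col = colour))
  popViewport()
}

# Usage: four base graphics panels, one footnote for the page
out_file <- file.path(tempdir(), "footnote-demo.pdf")
pdf(out_file)
par(mfrow = c(2, 2))
for (i in 1:4) plot(rnorm(20), main = paste("Panel", i))
add_footnote(paste("myscript.R, A. Author,", format(Sys.Date())))
dev.off()
```

Because the text is placed in device coordinates (npc) rather than in the coordinates of a single panel, it lands in the same corner no matter how the page is split up.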
Continue reading ‘R: Good practice – adding footnotes to graphics’
Filed under: R / R-Code | Leave a Comment
Tags: footnote, graphics
Although the graphic at the left might not seem 100% appropriate, it gives a hint of what I am about to do. I want to calculate all possible linear regression models with one dependent and several independent variables. In this posting I do not want to address bias and fitting issues, or the question of whether this makes sense from a statistical point of view. Here I want to focus on the technical issues only.
To solve the task, several approaches are possible. The first one is a step-by-step approach using a lot of code. Another one would be to make use of a specialized package. The packages leaps and meifly would be appropriate for the task but have some slight drawbacks in terms of flexibility. I will not address solutions using these packages here, but I would like to point out that, in contrast to the approach below, they need only a few lines of code to do the job.
The step-by-step approach
Let’s suppose we have the following set of four possible regressors.
regressors <- c("y1", "y2", "y3", "y4")
Now we want to construct a formula that contains the first and third regressor.
vec <- c(TRUE, FALSE, TRUE, FALSE)
paste(regressors[vec])
> [1] "y1" "y3"
So the paste command works vector-wise, which helps a lot in this case. Now we add a plus sign between the regressors…
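The remaining steps are in the full post. As a rough sketch of where this is heading, the logical vectors for every subset can be generated with expand.grid() and turned into formulas. The response name "y" is an assumption, since the excerpt does not name the dependent variable.

```r
regressors <- c("y1", "y2", "y3", "y4")

# One row per subset: a TRUE/FALSE flag for each regressor
flags <- expand.grid(rep(list(c(TRUE, FALSE)), length(regressors)))
flags <- flags[rowSums(flags) > 0, ]  # drop the model without regressors

# Right-hand sides such as "y1 + y3", built with paste(collapse = )
rhs <- apply(flags, 1, function(keep) {
  paste(regressors[keep], collapse = " + ")
})

# Full formulas; "y" is a placeholder name for the dependent variable
all_formulas <- lapply(paste("y ~", rhs), as.formula)
length(all_formulas)  # 2^4 - 1 = 15 candidate models
```

Each element can then be passed to lm() together with a data frame that contains y and the four regressors.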
Filed under: R / R-Code | 1 Comment
Tags: permutation, plyr, regression
Today I will treat a problem I encounter every once in a while. Let's suppose we have several data frames or vectors of unequal length but with partly matching column names, just like the following ones:
df1 <- data.frame(Intercept = .4, x1 = .4, x2 = .2, x3 = .7)
df2 <- data.frame(Intercept = .5, x2 = .8)
This may occur, for example, when fitting several multiple regression models, each time using a different combination of regressors. Now I would like to combine the results into one data frame. Neither merge() nor rbind() helps here, as they require equal lengths.
I posted this matter on r-help, as my first solution was somewhat awkward and could not be generalized to arbitrary data frames or lists of data frames. The first solution was posted by Charles C. Berry. myList is a list containing the data frames as elements:
myList <- list(df1, df2)
What he does is use a nested loop. The outer loop runs over the data frames in the list, the inner loop over the column names of each data frame. It takes each column name and the corresponding element [i, j] from the data frame ( myList[[i]] ) and writes it into an initially empty data frame (dat). In doing so, a new column named just like the column of the list element is created whenever it does not exist yet. The cells that are left out are automatically set to NA.
dat <- data.frame()
for(i in seq(along=myList)) for(j in names(myList[[i]]))
dat[i,j] <- myList[[i]][j]
dat
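For comparison, here is a loop-free variant in base R. It is only a sketch of one alternative, not one of the solutions posted on r-help: every data frame is padded with the columns it lacks and the pieces are stacked with rbind(). The rbind.fill() function in the plyr package does the same job in a single call.

```r
df1 <- data.frame(Intercept = .4, x1 = .4, x2 = .2, x3 = .7)
df2 <- data.frame(Intercept = .5, x2 = .8)
myList <- list(df1, df2)

# All column names that occur anywhere in the list
all_cols <- unique(unlist(lapply(myList, names)))

# Pad every data frame with its absent columns, filled with NA
padded <- lapply(myList, function(d) {
  absent <- setdiff(all_cols, names(d))
  if (length(absent) > 0) d[absent] <- NA_real_
  d[all_cols]  # same column order everywhere
})

dat <- do.call(rbind, padded)
dat
```

This sketch assumes numeric columns, as in the regression coefficient example; for other column types the NA filler would need to match.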
Continue reading ‘R: Combining vectors or data frames of unequal length into one data frame’
Filed under: R / R-Code | 2 Comments
Tags: data frame, plyr
After my last posting on how to extract the Google hit count, I was searching the web and found a nice website that lets you calculate many semantic relatedness measures. On request it seems to be possible to get free access to their API. The API allows you to send a request via the GET or POST method, which can be implemented in R using the RCurl package.
Anyway, I will post the code to do the normalized Google distance (NGD) calculation using R only. Last time I posted the first step, the extraction of the Google count implemented in R. Here comes the second step, the calculation itself, which uses the function described last time.
The calculation formula might look a bit scary at first glance:
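\[ \mathrm{NGD}(x, y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x, y)}{\log M - \min\{\log f(x), \log f(y)\}} \]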
Looking at its step-by-step development in the article Automatic Meaning Discovery Using Google it gets quite easy to understand the rationale behind it. What we need to know here is that M is the total number of web pages searched by Google. f(x) and f(y) are the counts for search terms x and y, respectively. f(x, y) is the number of web pages found on which both x and y occur (also see Wikipedia). So the ingredients are clear. Here comes the function.
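The function itself is in the full post. As a stand-in, here is a minimal sketch of the calculation alone, taking the counts as plain numbers instead of retrieving them from Google. The function name ngd, its argument names and the example counts are all made up.

```r
# Normalized Google distance from raw counts (hypothetical signature):
#   fx, fy : number of pages containing x and y, respectively
#   fxy    : number of pages containing both x and y
#   M      : total number of pages searched
ngd <- function(fx, fy, fxy, M) {
  lx <- log(fx)
  ly <- log(fy)
  (max(lx, ly) - log(fxy)) / (log(M) - min(lx, ly))
}

# Invented counts: (6 - 4) / (10 - 5) in log10 terms, i.e. 0.4
ngd(fx = 1e6, fy = 1e5, fxy = 1e4, M = 1e10)
```

The base of the logarithm does not matter because it cancels in the ratio. Terms that never co-occur (fxy = 0) give an infinite distance, so a real implementation would need to decide how to handle that case.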
Continue reading ‘R: Normalized Google distance (NGD) in R part II’
Filed under: R / R-Code | Leave a Comment
Tags: NGD, normalized google distance