### R vs. Matlab – a small example

15Feb10

At the institute where I work, quite a lot of people prefer using Matlab and only a few of them know about R. Today one of my colleagues, who is also an eager Matlab user, ran into the following problem:

• He had a vector $v$ in hand which consisted of $\frac{n(n+1)}{2}$ elements.
• He wanted to reshape this data into an $n \times n$ matrix $M$, where the element $M_{ij}$ is equal to $v_{k+j}\,I(j \le n-i+1)$ with $k=\frac{(2n-i+2)(i-1)}{2}$, and $I(j \le n-i+1)=1$ if the condition $j \le n-i+1$ is satisfied and $0$ otherwise. In other words, the first $n-i+1$ elements of the $i$th row of $M$ are equal to the vector $(v_{k+1},v_{k+2},\ldots,v_{k+n-i+1})$ and the remaining elements are zero.

He struggled for several minutes over how to design a loop for this task. Of course, writing such a loop is not very difficult, but why waste time if we can get the same result in a single line of R code?
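One possible way to do this, as a sketch and not necessarily the one-liner the author had in mind: the positions with $i + j \le n + 1$ are filled column by column and the result is transposed, so that $v$ ends up row by row in the upper-left triangle. The example values of `n` and `v` are assumptions for illustration.

```r
n <- 4
v <- seq_len(n * (n + 1) / 2)   # example input: 1, 2, ..., n(n+1)/2

# R fills a logical-indexed matrix column by column; the mask i + j <= n + 1
# is symmetric, so transposing afterwards gives the row-by-row layout.
M <- t(replace(matrix(0, n, n), outer(1:n, 1:n, "+") <= n + 1, v))
M
```

For `n <- 4` the first row holds `v[1:4]`, the second row `v[5:7]` followed by a zero, and so on.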

### Progress bars in R (part II) – a wrapper for apply functions

11Jan10

In a previous post I gave some examples of how to make a progress bar in R. In those examples the bars were created within loops. Very often, though, I would like to have a progress bar when using apply(). The plyr package provides several apply-like functions that include progress bars, so one could have a look at it and use a plyr function instead of apply where possible. Anyway, here comes a wrapper for apply, lapply and sapply that has a progress bar. It seems to work, although one known issue is the use of vectors (like c(1,2)) with the MARGIN argument in apply_pb(). As the performance comparison in the full post shows, the wrapper also adds considerable overhead, which is the main drawback of this approach.
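The general idea of such a wrapper can be sketched as follows. This is a minimal illustration built on `txtProgressBar()` and `setTxtProgressBar()` from the utils package, not the code from the post; the name `lapply_pb` and its internals are assumptions.

```r
# Sketch of a progress-bar wrapper around lapply() (hypothetical helper)
lapply_pb <- function(X, FUN, ...) {
  FUN <- match.fun(FUN)
  n <- length(X)
  if (n == 0) return(lapply(X, FUN, ...))   # txtProgressBar needs max > min

  pb <- txtProgressBar(min = 0, max = n, style = 3)
  on.exit(close(pb))

  i <- 0
  wrapper <- function(x, ...) {
    i <<- i + 1                  # count calls in the enclosing environment
    setTxtProgressBar(pb, i)     # redraw the bar on every call
    FUN(x, ...)
  }
  lapply(X, wrapper, ...)
}

squares <- lapply_pb(1:20, function(x) x^2)
```

The extra function call and the redraw on every element are where the overhead comes from, so for very cheap functions the bar can cost more than the work itself.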

### Infomaps using R – Visualizing German unemployment rates by district on a map

16Nov09

Lately, David Smith from REvolution Computing challenged the R community to reproduce a beautiful choropleth map (= multiple regions map/thematic map) of US unemployment rates he had seen on the Flowing Data blog. Here you can find the impressive results. Being a fan of beautiful visualizations, I tried to produce a similar map for Germany.

1. Getting the spatial country data

The first step was to get the data needed to draw a map of the German administrative districts. Unfortunately, maps for Germany are not included in the maps package; otherwise I could easily have adapted the code from the challenge. Getting data: The GADM database of Global Administrative Areas aims to provide data on administrative districts for the whole world at different levels (country, state and county level). The data can be downloaded as a shapefile, an ESRI geodatabase file, a Google Earth .kmz file and, very conveniently for R users, as an Rdata file.

2. Getting socio-demographic data (e.g. unemployment rates by administrative district)

A lot of data is available online at www.statistikportal.de. On this site you find links to several databases. To get the unemployment stats by county I clicked my way through: Regionaldatenbank Deutschland -> Arbeitsmarkt -> Arbeitsmarktstatistik der Bundesagentur für Arbeit -> Arbeitslose nach ausgewählten Personengruppen sowie Arbeitslosenquoten – Jahresdurchschnitt – (ab 2008) regionale Tiefe: Kreise und krfr. Städte -> Werteabruf -> save as CSV format. This table contains all the information I need, although for some reason no data is listed for a few districts.

I also looked for another source. Regionalatlas offers a nice online visualization tool. In the menu I selected unemployment rate 2008 as the indicator. Besides the nice visualization, there is a menu button “tables” where you can retrieve an HTML table of the data. I simply copied and pasted it into a .txt file, which gives me a tab-separated format I can read in R. But still: some districts are not listed. Here is a PDF file containing the data.
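Reading such a tab-separated table and turning the rates into map colours might look roughly like this. It is only a sketch: the district names and rates are invented placeholders (not real statistics), and the column names, class breaks and grey shades are assumptions.

```r
# Placeholder data: names and values are invented for illustration only.
txt <- "district\trate\nDistrict A\t4.2\nDistrict B\t9.8\nDistrict C\t13.5\nDistrict D\tNA"
unemp <- read.delim(text = txt, stringsAsFactors = FALSE)

# Bin the rates into classes and give each class a shade of grey;
# districts without data stay white.
breaks <- c(0, 5, 10, 15, 20)
unemp$class <- cut(unemp$rate, breaks = breaks)
shades <- grey(seq(0.9, 0.3, length.out = length(breaks) - 1))
unemp$col <- ifelse(is.na(unemp$class), "white", shades[as.integer(unemp$class)])
unemp
```

With a real file, `read.delim("yourfile.txt")` would replace the `text =` argument. The resulting colour vector can be used to fill the district polygons once the rows have been matched to the order of the polygons in the spatial data.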

### R: Function to create tables in LaTeX or LyX to display regression model results

19Jun09

Most people using LaTeX feel that creating tables is no fun. Some days ago I stumbled across a neat function written by Paul Johnson that produces LaTeX code, including LaTeX code that can be used within LyX. The output is meant for regression models and looks like output from the Stata outreg command. His R function that produces the LaTeX code has the same name: outreg(). The outreg code can be found on his website or in the PDF copy of the code from his website.

I took the code, put it into a .rnw file and sweaved it. It worked like a charm and produced beautiful results (see the picture on the left and the PDF). Below you can find the code for the noweb file (.rnw). LaTeX code is colored grey, R code is colored blue. Just have a look at all the results as a PDF file. In addition, Paul Johnson has created a nice list of R tips that can also be found on his website.
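To give a feeling for what such a function does, here is a hand-rolled sketch that turns the coefficient table of a fitted model into LaTeX tabular code. It is explicitly not Paul Johnson's outreg() and does not reproduce its interface or layout; the model on the built-in `mtcars` data and the column layout are assumptions for illustration.

```r
# Hand-rolled illustration, NOT outreg(): coefficient table -> LaTeX tabular
fit <- lm(mpg ~ wt + hp, data = mtcars)
coefs <- summary(fit)$coefficients   # columns: Estimate, Std. Error, t value, Pr(>|t|)

# One table row per coefficient: name, estimate, standard error in parentheses
rows <- sprintf("%s & %.3f & (%.3f) \\\\",
                rownames(coefs), coefs[, "Estimate"], coefs[, "Std. Error"])

latex <- c("\\begin{tabular}{lrr}",
           "\\hline",
           " & Estimate & (S.E.) \\\\",
           "\\hline",
           rows,
           "\\hline",
           sprintf("N & %d & \\\\", as.integer(nobs(fit))),
           "\\end{tabular}")
cat(latex, sep = "\n")
```

Inside a Sweave chunk with the option `results=tex`, the printed lines are passed straight through to the LaTeX document.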

### Getting started with R (for German speakers)

05Jun09

Just a little note for German-speaking R beginners: there is an introductory course on R (in German) available online on the website of the department of methodology and evaluation research at the University of Jena. Dr. Ivailo Partchev gives a seven-session course on the topic (duration 11.5 hours).

### R: Building functions – using default settings that can be modified via the dot-dot-dot / three point argument

07Apr09

Before you read this post, please have a look at Enrique’s comment below. He pointed out that the built-in R function modifyList() already does what I wanted to describe in this post. Well, I live to learn :)

I was wondering how I could write a function that uses default settings but accepts a list to overwrite the default settings via the dot-dot-dot / three-point argument. Here comes my solution.

# building a function with a list of default settings
# that can be modified by an optional list passed
# via the dot-dot-dot / three point argument
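A minimal sketch of the idea, using modifyList() as suggested in Enrique's comment. This is not necessarily the author's original solution; the function name `plot_settings` and the default values are hypothetical.

```r
# Sketch: defaults that a caller can override via `...` (hypothetical example)
plot_settings <- function(...) {
  defaults <- list(col = "black", lwd = 1, main = "untitled")
  user <- list(...)
  # accept either named arguments or a single unnamed list of settings
  if (length(user) == 1 && is.null(names(user)) && is.list(user[[1]])) {
    user <- user[[1]]
  }
  modifyList(defaults, user)   # entries in `user` overwrite the defaults
}

plot_settings()                                   # all defaults
plot_settings(lwd = 2)                            # override one setting
plot_settings(list(col = "red", main = "demo"))   # override via a list
```

Settings that are not mentioned in the call keep their default values, and settings that do not exist in the defaults are simply appended to the result.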



### How accurate or reliable are R calculations?

28Mar09

On the REvolutions Blog there is a nice post addressing the often-raised question of “how good or reliable R is”. At my university R is hardly used. Lecturers have sometimes asked me whether the calculations done by R and its packages are accurate. The linked post addresses this matter and tries to clarify the point.