Remove password protection from Excel sheets using R

06May18

Most data scientists wished that all data lived neatly managed in some DB. However, in reality, Excel files are ubiquitous and often a common way to disseminate results or data within many companies. Every now and then I found myself in the situation where I wanted to protect Excel sheets against users accidentally changing them. A few months later, however, I found that I sometimes had forgotten the password I used. The “good” thing is that protecting Excel sheets by password is far from safe and access can be recovered quite easily. The following works for .xlsx files only (tested with Excel 2016 files), not the older .xls files.

Before implementing the steps in R, I will outline how to remove the password protection “by hand”. The R way is simply the automation of these steps. The first thing one needs to understand is that a .xlsx file is just a collection of folders and files in a zip container. If you unzip a .xlsx file (e.g. using 7-Zip) you get the following folder structure (sorry, German UI):

In the folder ./xl/worksheets we find one XML file for each Excel sheet. The sheet’s password protection is encoded directly in the sheet. While there used to be the plain password text in the XML in former versions, now, we find the hashed password (see part marked in yellow below). In order to get rid of the password protection, we simply can remove the whole sheetProtection node from the XML. We can do that in any text editor and save the file.

As the last step, we need to recreate the .xlsx file by creating a zip folder that contains our modified XML file (German UI again).

Finally, we change the file extension from .zip back to .xlsx and voilá, we get an Excel file without password protected sheets. Programming the steps outlined above in R is quite straightforward. The steps are commented in the GitHub gist below.

Advertisements


8 Responses to “Remove password protection from Excel sheets using R”

  1. Hi! Cool article. Is there any similar way to remove password from the entire workbook as well ?
    Awaiting your reply
    Regards

  2. 2 markheckmann

    Hi Yashodhan,
    frankly, I do not know. I have not had the problem yet :)
    Bests
    Mark

  3. Thank you, very interesting. It works, I get a warning though: “Warning message:
    In readLines(ws, encoding = “windows1″) :
    incomplete final line found on ‘_tmp/xl/worksheets/sheet1.xml'”

    Any ideas?

    best
    h

  4. 4 markheckmann

    Hi Holger,
    it is a common warning when using readLines. The reason is that R does not find an end-of-line (EOL) character in the file. R warns you that the file may thus be incomplete. That’s all and not the case here. You can read more about this common warning in many places, e.g. here https://stackoverflow.com/questions/5990654/incomplete-final-line-warning-when-trying-to-read-a-csv-file-into-r or https://stackoverflow.com/questions/47357125/incomplete-final-line-found-error-when-reading-url-using-readlines-in-r?rq=1
    Bests
    Mark

  5. after I unzipped the xlsx file I found these folder / files “[6]DataSpaces”, “EncryptedPackage”, “EncryptionInfo”. No xml file to be found. I guess Microsoft improved their encryption after this post.

  6. 6 markheckmann

    Hi Paul.
    Haha. That is because you have an a password protected workbook, not sheets. With a protected workbook, I don’t think there is no way easy way to remove a password. It is solid encryption, I suppose.
    Bests
    Mark


  1. 1 Remove password protection from Excel sheets using R | R-bloggers
  2. 2 Distilled News | Data Analytics & R

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s


%d bloggers like this: