Find Invalid Links in Excel That Cannot Be Broken

September 18th 2016 Microsoft Office

If an Excel file contains links to other workbooks, Excel will offer to update the data every time you open the file:

Workbook contains links to external sources

If the linked workbook doesn't exist, it will provide an option to edit the invalid link:

Edit the links you think are wrong

If the link was added by mistake, you might want to remove it. There is an option to do that in the dialog that opens - Break Link:

Break invalid link

Unfortunately, the command silently fails in certain scenarios, i.e. the link remains present after the command is executed. Supposedly, this can happen when the link is used in named ranges or in data validation. To efficiently check the names, Name Manager can be used. However, the only way to check data validations is to open every one of them and check the source specified:

Check the Source of each Data Validation definition

Although, in my case I knew which cells had data validation defined, there where too many of them to manually find the problematic one by checking them one by one. Then it occurred to me that the Excel workbook format is actually compressed XML which can be searched through with any descent text editor.

I made a copy of the .xlsx file and changed its extension to .zip. I then decompressed its contents into a folder and used Visual Studio Code's Find in Files functionality to find the worksheet(s) containing the link to the missing file (the filename can be found in Excel's Edit Links dialog). It is enough to only search the XML files.

In my workbook, the only match was found in xl\worksheets\sheet15.xml:

<x14:dataValidation type="list" allowBlank="1" showInputMessage="1">
  <x14:formula1>
    <xm:f>'C:\Users\Damir\Documents\[MissingFile.xlsx]MySheet'!#REF!</xm:f>
  </x14:formula1>
  <xm:sqref>E13:E37</xm:sqref>
</x14:dataValidation>

Now, I needed to match this file to a worksheet in Excel. I first checked xl\workbook.xml.rels and found the following matching entry:

<Relationship Id="rId15" 
  Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet"
  Target="worksheets/sheet15.xml"/>

Target attribute contains my filename of my match. I need the value of Id attribute to find the actual name of the worksheet in Excel. In xl\workbook.xml there is an entry with it:

<sheet name="ActualName" sheetId="42" r:id="rId15"/>

I found it based on r:id attribute value. The name attribute contains the worksheet name as displayed in Excel. Now I could finally reopen the workbook in Excel, navigate to ActualName worksheet, and open the Data Validation dialog for the E13:E37 range of cells (as specified in the snippet from xl\worksheets\sheet15.xml above). After I changed the Source there, the Break Link command in Edit Links dialog succeeded as well - no more nagging dialogs every time I open the file.

Get notified when a new blog post is published (usually every Friday):

If you're looking for online one-on-one mentorship on a related topic, you can find me on Codementor.
If you need a team of experienced software engineers to help you with a project, contact us at Razum.
Copyright
Creative Commons License