How To Find Link Targets

Post by Jesse654 on Sat Jan 21, 2012 1:51 pm

The more I use Linux, the more I realize how important and ubiquitous symbolic links are. I would run type and which on a command, only to find that the file I was actually looking for was the target of a link. So I wrote an awk program that tracks a link through however many levels it takes. It lists each link, then the final target, as "ls -ld" listings.

The Script
Code: Select all
#  tracklink.awk - SJB - T1111.22 - F1201.20
#  given a file, see if it is a link and, if so, track it until the final
#  target file is found; handles relative links and resolves them so they
#  don't include /dir/../ notation

#  Usage:
#  awk -f tracklink.awk linkfile ...

BEGIN {
   if ( ARGC < 2 )
   {
      ErrOut("Usage:")
      ErrOut("  awk -f tracklink.awk linkfile ...")
      exit 1
   }
   argc = 1
   while ( argc < ARGC )
   {
      if ( argc > 1 )                  #  print a blank line between
         printf "\n"                  #   trackings

      linkfile = ARGV[argc]            #  potential link file
      ARGV[argc++] = ""               #  take file off command line list

      delete listl                  #  reset; "listl" checks for
      listlnum = 0                  #   circular links
      path = ""

      listl[linkfile] = ++listlnum      #  add to list
      lsl = GetLsl(linkfile)            #  get   "ls -ld" listing
      PrintLsl(lsl, linkfile)            #  print "ls -ld" listing

      while ( substr(lsl, 1, 1) == "l" )   #  while link file
      {
         numA = split(lsl, A)
         newlinkfile = A[numA]         #  get last field in listing

         #  if newlinkfile does not begin with "/" then it is
         #   relative, so add path from previous linkfile
         if ( substr(newlinkfile, 1, 1) != "/" )
         {
            ndx = PosLastOccurCh("/", linkfile)
            newpath = substr(linkfile, 1, ndx)
            if ( newpath )
               path = newpath
            newlinkfile = path newlinkfile
            gsub("/[^/]+/[.][.]/", "/", newlinkfile)
         }                        #  replace /dir/../ with /

         linkfile = newlinkfile
         if ( linkfile in listl )
         {
            ErrOut("tracklink.awk:  circular links")
            break
         }
         listl[linkfile] = ++listlnum   #  same add,
         lsl = GetLsl(linkfile)         #   get and
         PrintLsl(lsl, linkfile)         #   print as above
      }
   }
}

#  returns "ls -ld" listing; -d option needed in case target is a directory
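function GetLsl(linkfile,      cmd, lsl, q)
{
   q = linkfile
   gsub(/'/, "'\\''", q)            #  escape any ' so the name can be
   cmd = "ls -ld -- '" q "' 2> /dev/null"   #   single-quoted for the shell
   cmd | getline lsl
   close(cmd)
   return lsl
}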

#  prints the "ls -ld" listing or, if empty, prints error msg with linkfile name
function PrintLsl(lsl, linkfile)
{
   if ( lsl )
      print lsl
   else
      ErrOut(linkfile ":  no listing")
}

#  returns the position of the last occurrence of char in string or zero if char
#  is not present
function PosLastOccurCh(char, string,      ndx, finalndx)
{
   while ( ndx = index(string, char) )
   {
      finalndx += ndx
      string = substr(string, ndx+1)
   }
   return finalndx
}

#  prints the given string to stderr
function ErrOut(s) { print s > "/dev/stderr" }
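
A note on the /dir/../ cleanup in the script above: a single gsub() pass does not fully collapse stacked components such as /usr/bin/../../etc/alternatives/awk, because the text needed for the second match only appears after the first replacement. The listing still works, since ls resolves the leftover .. itself, but the printed path is longer than it needs to be. Below is a minimal sketch of a repeat-until-stable variant. The helper name collapse_dotdot is hypothetical and not part of tracklink.awk. Like the original pattern it is purely textual, so it assumes an absolute path and that the directory being dropped is not itself a symbolic link.

```shell
# collapse_dotdot (hypothetical helper): textually remove /dir/../ pairs,
# repeating until a pass makes no further replacement
collapse_dotdot() {
   printf '%s\n' "$1" | awk '{
      n = 1
      while ( n )                               # each pass may expose a new pair
         n = gsub("/[^/]+/[.][.]/", "/")
      print
   }'
}

collapse_dotdot /usr/bin/../lib/xulrunner-1.9.2.24/xulrunner
collapse_dotdot /usr/bin/../../etc/alternatives/awk
```

The second call needs two passes: the first turns /bin/../ into /, and only then does /usr/../ become visible to the pattern.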


Running the Script
You can run it like this:
awk -f tracklink.awk /path/linkfile
It works fine with both gawk and mawk. Because it calls the system's ls command, there is no guarantee that it will work on a non-POSIX system (like Windows without Cygwin).

Making it Easy
Place tracklink.awk into a directory called ~/awk (create ~/awk if it does not exist). Then add the following three functions to ~/.bashrc (create ~/.bashrc if it does not exist):
Code: Select all
#  tracks given file(s) if it is a link; give a relative or full path
function tracklink
{
   /usr/bin/awk -f ~/awk/tracklink.awk "$@"
}

#  resolves a single link (or any file) to its final target
function resolvelink
{
   /usr/bin/awk -f ~/awk/tracklink.awk "$1" | /usr/bin/awk 'END{print A[split($0,A)]}'
}

#  prints "type $1" and "which $1" and tracks $1 if it is a link
function tell
{
   type "$1"
   /usr/bin/which "$1"  &&  /usr/bin/awk -f ~/awk/tracklink.awk "$(/usr/bin/which "$1")"
}


Then, from the command line, run this command to reload .bashrc:
Code: Select all
source ~/.bashrc


Understanding It All
Since the awk file is well documented, I won't explain it line by line. I will mention that circular links are handled and other broken links--those without targets--give a "no listing" message.

The tracklink() function will run on multiple files; the resolvelink() and tell() functions run on only one file at a time.

The tracklink() function in .bashrc allows you to run:
Code: Select all
tracklink /path/linkfile

Along with the filename, this function accepts either a relative path or a fully qualified path. Given a non-link file (including a directory name), it will simply print an "ls -ld" listing of it.

To print out the final target of any link, if any, try:
Code: Select all
resolvelink /path/linkfile

For a link, resolvelink prints the filename of the final target. If a fully qualified path is given, a fully qualified path is returned; if a relative path (or no path) is given, a relative path (or no path) is returned. On error, it gives an error message and prints the last file it found, if any. Given a non-link file (including a directory name), it simply prints the filename given.
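
The second awk in resolvelink only looks at the last line of the tracking and prints its last whitespace-separated field. Here is a small sketch of just that step, fed with a sample listing rather than a real run. The helper name last_field_of_last_line is hypothetical, and the sketch relies on $0 still holding the last record inside END, which gawk and mawk both provide.

```shell
# last_field_of_last_line (hypothetical helper): the second half of resolvelink
last_field_of_last_line() {
   awk 'END { print A[split($0, A)] }'
}

# sample tracking output, taken from an "ls -ld" style chain
printf '%s\n' \
   'lrwxrwxrwx 1 root root 21 2010-08-28 00:04 /usr/bin/awk -> /etc/alternatives/awk' \
   'lrwxrwxrwx 1 root root 13 2010-08-28 00:04 /etc/alternatives/awk -> /usr/bin/gawk' \
   '-rwxr-xr-x 1 root root 317880 2009-12-10 18:24 /usr/bin/gawk' |
   last_field_of_last_line
```

Because the split is on whitespace, a target name that contains spaces would be cut at its last space.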

To print out more info, consider the tell() function, meaning "tell me info on this file":
Code: Select all
tell linkfile

The function tell() is where this started for me. This function will print out the result of type (a bash shell builtin) and which, and, if which succeeds, it gives tracklink.awk the results of which and prints the link info. Since which is used, the file given needs to be in the system PATH, but no path needs to be given with the filename.

Testing It
Try entering this in a terminal:
Code: Select all
awk -f ~/awk/tracklink.awk $(ls -l /usr/bin/* | grep xul | awk '/^l/{print $8}')

On most Linux Mint systems, this will print three trackings. If you get a few "no listing" errors, try changing $8 to $9 (or $7).

I need to warn that this next test will normally produce a lot of output. That is why I output it to the test.out file. On my system, test.out is 47k and 879 lines long. (Those are just some of the links that exist on your system.) Again, you may need to change $8 to $9 (or $7).
Code: Select all
ls -l /usr/bin/* | awk '/^l/{print $8}' | xargs awk -f ~/awk/tracklink.awk > test.out

You can open test.out in your favorite editor or use less:
Code: Select all
less test.out

less uses PageUp/PageDown/Home/End/UpArrow/DownArrow, among others, for navigation. To quit, press q.
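
The field number in the ls -l pipelines above depends on the date format ls happens to use, which is why $8 may need to become $9 or $7. Here is a sketch that sidesteps the field counting by asking find for the symbolic links directly. It assumes a find that supports -maxdepth (GNU and BSD find do), and the function name list_links is hypothetical.

```shell
# list_links DIR: print the symbolic links directly inside DIR, one per line
list_links() {
   find "$1" -maxdepth 1 -type l
}

# demo in a scratch directory so nothing on the system is touched
demo=$(mktemp -d)
echo "some semi-random text here" > "$demo/target.txt"
ln -s target.txt "$demo/link5.txt"
ln -s x.txt      "$demo/linkx.txt"
list_links "$demo" | sort          # lists the two links, not target.txt
rm -rf "$demo"
```

With tracklink.awk installed as described earlier, the same kind of selection could be handed to it with find /usr/bin -maxdepth 1 -type l -name '*xul*' -exec awk -f ~/awk/tracklink.awk {} +, which does not rely on column positions at all.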

Have fun!
Last edited by Jesse654 on Sun Jan 22, 2012 2:28 pm, edited 1 time in total.

Post by xenopeek on Sat Jan 21, 2012 2:43 pm

This is what the readlink command is for :wink: (And I'm still too rusty on AWK... :D) The following command will show what file the given filename ultimately points at (it will follow symbolic links recursively). See the manpage for how you can use it also to find problems with symbolic links.
Code: Select all
readlink -f filename

For example, assume you have the following files in a directory:
Code: Select all
-rw-r--r--  1 vincent vincent     0 2012-01-21 19:35 a
lrwxrwxrwx  1 vincent vincent     1 2012-01-21 19:35 b -> a
lrwxrwxrwx  1 vincent vincent     1 2012-01-21 19:36 c -> b

Then "readlink -f c" outputs the absolute path of "a".

Bonus: to find all links to a file (replace dirname with the directory to search, replace filename with the file to search for):
Code: Select all
find -L dirname -samefile filename
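
A sketch that reproduces the a, b, c example in a scratch directory, assuming GNU readlink and find (as shipped with Linux Mint). The function name links_to is hypothetical.

```shell
# links_to DIR FILE: list every name under DIR that resolves to FILE
links_to() {
   find -L "$1" -samefile "$2"
}

demo=$(mktemp -d)
: > "$demo/a"                # empty regular file
ln -s a "$demo/b"            # b -> a
ln -s b "$demo/c"            # c -> b

readlink -f "$demo/c"        # prints the absolute path of a
links_to "$demo" "$demo/a" | sort    # a itself, plus b and c
rm -rf "$demo"
```

Note that the file itself is part of the find output, since it is trivially the same file as itself.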

Post by Jesse654 on Sun Jan 22, 2012 2:09 pm

Vincent, my sincere thanks. You are, again, a fountain of information. :)

Re readlink, previously I did try it without the -f option and was disappointed that it found only one level of link. I looked at the info page and the "`Canonicalize mode' `readlink' outputs the absolute name of the given file" explanation made no sense to me. (And don't get me started on the word "canon," the modern uses of which--or some may say modern additional definitions--I find quite...silly, among other adjectives.) The man page is only a little better, IMO. After your post, Vincent, I tried it with the -f option. This seems to be analogous to my resolvelink.

More analysis--given:
Code: Select all
lrwxrwxrwx 1 blu blu     9 2011-12-15 17:46 link1.txt -> link2.txt
lrwxrwxrwx 1 blu blu     9 2011-12-15 17:46 link2.txt -> link3.txt
lrwxrwxrwx 1 blu blu     9 2011-12-15 17:46 link3.txt -> link4.txt
lrwxrwxrwx 1 blu blu     9 2011-12-15 17:46 link4.txt -> link5.txt
lrwxrwxrwx 1 blu blu    10 2011-12-15 17:45 link5.txt -> target.txt
-rw------- 1 blu blu    27 2011-12-15 18:07 target.txt
lrwxrwxrwx 1 blu blu    13 2011-12-15 18:10 link1circ.txt -> link2circ.txt
lrwxrwxrwx 1 blu blu    13 2011-12-15 18:10 link2circ.txt -> link3circ.txt
lrwxrwxrwx 1 blu blu    13 2011-12-15 18:10 link3circ.txt -> link4circ.txt
lrwxrwxrwx 1 blu blu    13 2011-12-15 18:10 link4circ.txt -> link1circ.txt
lrwxrwxrwx 1 blu blu    12 2011-12-15 18:01 linkself.txt -> linkself.txt
lrwxrwxrwx 1 blu blu     5 2011-12-15 17:43 linkx.txt -> x.txt


$ readlink link1circ.txt
gives
link2circ.txt
which isn't much help and
$ echo $?
gives
0
which belies the circular link aspect.

$ readlink -f link1circ.txt
gives no output and
$ echo $?
gives
1
which is some help, perhaps in a bash script.

$ readlink -f x.txt
gives
/home/.../awk/test/x.txt
[ellipsis added] which is either a bug or a very poor design, since:
$ ls x.txt
gives
ls: cannot access x.txt: No such file or directory

I can confirm that the file "x.txt" does not exist on my machine in that directory.

I tried other options. It seems that only:
$ readlink -ev x.txt
or
$ readlink -v x.txt
will give a proper error message:
readlink: x.txt: No such file or directory
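
The exit status is what makes readlink usable from a script. The sketch below assumes GNU readlink (other implementations may differ): -f accepts a missing last component, while -e requires every component to exist, so -e is the option to test with. The function name link_status is hypothetical.

```shell
# link_status FILE: report whether FILE resolves to something that exists
link_status() {
   if target=$(readlink -e -- "$1"); then
      echo "ok: $1 -> $target"
   else
      echo "broken: $1"
   fi
}

demo=$(mktemp -d)
echo "some semi-random text here" > "$demo/target.txt"
ln -s target.txt    "$demo/link5.txt"       # good link
ln -s x.txt         "$demo/linkx.txt"       # link without a target
ln -s linkself.txt  "$demo/linkself.txt"    # link that points at itself

for f in link5.txt linkx.txt linkself.txt x.txt; do
   link_status "$demo/$f"
done
rm -rf "$demo"
```

Only link5.txt is reported as ok. The dangling link, the self-referencing link and the non-existent x.txt all come back as broken, without telling them apart.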

Findings:
tracklink.awk, tracklink(), resolvelink(), tell() are useful, easily modifiable and extensible, and have decent documentation. readlink pales in comparison. (Can you tell I'm not a readlink fan? :) )
I'll leave conclusions to individual users.

If the reader is interested in duplicating the links (and one text file) listed above, entering these in a terminal should do it:
Code: Select all
echo "some semi-random text here" > target.txt
ln -s link2.txt link1.txt
ln -s link3.txt link2.txt
ln -s link4.txt link3.txt
ln -s link5.txt link4.txt
ln -s target.txt link5.txt
ln -s link2circ.txt link1circ.txt
ln -s link3circ.txt link2circ.txt
ln -s link4circ.txt link3circ.txt
ln -s link1circ.txt link4circ.txt
ln -s linkself.txt linkself.txt
ln -s x.txt linkx.txt


find--yes, I have not yet tamed the "find" beast. I know it can be quite useful and has a ton of options. I need to go through a bunch of options and find a good tutorial on the subject. In trying:
$ find -L . -samefile link1.txt
I found a lot of extraneous output not pertaining to the link1.txt chain. This helped:
$ find -L . -samefile link1.txt 2> /dev/null

Post by xenopeek on Sun Jan 22, 2012 3:26 pm

If you need to debug why a symbolic link is not working, after careful review, I'm thinking some scripting would indeed be needed :mrgreen:

I think you meant to run the commands with linkx.txt as argument? You did it with x.txt, and that is a non-existent file.
Code: Select all
$ readlink -fv linkx.txt
/home/vincent/t/x.txt
$ readlink -ev linkx.txt
readlink: linkx.txt: No such file or directory

The second command tells you the link target does not exist. It does tell you which symbolic link fails, but not what the target is (or, if you already passed through other symbolic links to get here, what they were).

And regarding the circular symbolic links, okay, so it can't handle endless recursion 8)
Code: Select all
$ readlink -fv link1circ.txt
readlink: link1circ.txt: Too many levels of symbolic links

BTW, Nautilus crashes if I try to browse the directory where I have your example files :? Methinks you found a bug in Nautilus :wink:

Perhaps a useful feature missing from this is showing the permissions on the containing directory of a symbolic link. The permissions on the containing directory determine whether you can create, change, delete and follow a symbolic link in it. The permissions on the normal file linked to determine if you can execute, read and write to that file. (The permissions shown by ls for a symbolic link are meaningless, except for the l at the start :wink:)
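
A sketch of that idea in shell, listing the containing directory before the link itself. It uses only ls and dirname, and the function name show_with_dir is hypothetical.

```shell
# show_with_dir FILE: "ls -ld" the directory holding FILE, then FILE itself
show_with_dir() {
   ls -ld -- "$(dirname "$1")" &&
   ls -ld -- "$1"
}

demo=$(mktemp -d)
ln -s x.txt "$demo/linkx.txt"        # a link without a target still lists fine
show_with_dir "$demo/linkx.txt"
rm -rf "$demo"
```

The two listings are made by separate ls calls so that the directory always comes first, whatever order ls would sort its arguments in.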

Post by Jesse654 on Sun Jan 22, 2012 5:12 pm

Actually I was testing how "readlink -f" works with a non-existent file. It prints the non-existent file with a fully qualified pathname! And it returns a zero (success) code. My goodness! The same goes for "readlink -fv".

Vincent Vermeulen wrote: BTW, Nautilus crashes if I try to browse the directory where I have your example files :? Methinks you found a bug in Nautilus :wink:

I did. :lol: See:
https://bugzilla.gnome.org/show_bug.cgi?id=666418
Since you brought this up (not me), I'll say, I don't know how the File Browser (one of the main components) of the most popular DE around can have such a serious and obvious error. If they were programming to show links, then I would think that broken links (circular and no-target links) would be on the top of the list to test. Otherwise...well, I'll stop this rant now.

Anyway, with tracklink.awk, I was just trying to find as much info as I could about various links on the system. Some production links on the system (meaning not contrived by me) actually involve 4 files--three links and a target.

Here are some examples I find interesting:
Code: Select all
lrwxrwxrwx 1 root root 21 2010-08-30 01:30 /usr/bin/c++ -> /etc/alternatives/c++
lrwxrwxrwx 1 root root 12 2010-08-30 01:30 /etc/alternatives/c++ -> /usr/bin/g++
lrwxrwxrwx 1 root root 7 2011-04-15 12:08 /usr/bin/g++ -> g++-4.4
-rwxr-xr-x 1 root root 220428 2010-03-26 18:43 /usr/bin/g++-4.4

lrwxrwxrwx 1 root root 27 2011-11-10 14:53 /usr/bin/xulrunner -> /etc/alternatives/xulrunner
lrwxrwxrwx 1 root root 24 2011-11-10 14:53 /etc/alternatives/xulrunner -> /usr/bin/xulrunner-1.9.2
lrwxrwxrwx 1 root root 35 2011-11-10 14:53 /usr/bin/xulrunner-1.9.2 -> ../lib/xulrunner-1.9.2.24/xulrunner
-rwxr-xr-x 1 root root 3917 2011-11-08 04:19 /usr/lib/xulrunner-1.9.2.24/xulrunner

lrwxrwxrwx 1 root root 21 2010-08-28 00:04 /usr/bin/awk -> /etc/alternatives/awk
lrwxrwxrwx 1 root root 13 2010-08-28 00:04 /etc/alternatives/awk -> /usr/bin/gawk
-rwxr-xr-x 1 root root 317880 2009-12-10 18:24 /usr/bin/gawk


Typing stuff like:
Code: Select all
$ which xulrunner
/usr/bin/xulrunner
$ ls -l `which xulrunner`
lrwxrwxrwx 1 root root 27 2011-11-10 14:53 /usr/bin/xulrunner -> /etc/alternatives/xulrunner
$ ls -l /etc/alternatives/xulrunner
lrwxrwxrwx 1 root root 24 2011-11-10 14:53 /etc/alternatives/xulrunner -> /usr/bin/xulrunner-1.9.2
$ ls -l /usr/bin/xulrunner-1.9.2
lrwxrwxrwx 1 root root 35 2011-11-10 14:53 /usr/bin/xulrunner-1.9.2 -> ../lib/xulrunner-1.9.2.24/xulrunner
$ ls -l ../lib/xulrunner-1.9.2.24/xulrunner
ls: cannot access ../lib/xulrunner-1.9.2.24/xulrunner: No such file or directory
$ ls -l /usr/bin/../lib/xulrunner-1.9.2.24/xulrunner
-rwxr-xr-x 1 root root 3917 2011-11-08 04:19 /usr/bin/../lib/xulrunner-1.9.2.24/xulrunner
got tiring.
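
For comparison, here is the same hop-by-hop walk written as a shell loop around plain readlink (no -f). This is only a sketch, not a replacement for tracklink.awk: the function name walk_links is hypothetical, it assumes a readlink command is installed, its variables are global because POSIX sh has no local, and it stops circular links with an arbitrary hop limit rather than a list of visited names.

```shell
# walk_links FILE: print each hop of a symlink chain as an "ls -ld" line,
# resolving relative targets against the directory of the link they came from
walk_links() {
   file=$1
   hops=0
   ls -ld -- "$file" || return 1
   while [ -L "$file" ]; do
      hops=$((hops + 1))
      if [ "$hops" -gt 40 ]; then          # arbitrary cap; stops circular links
         echo "walk_links: too many hops" >&2
         return 1
      fi
      target=$(readlink -- "$file") || return 1
      case $target in
         /*) file=$target ;;                       # absolute target
         *)  file=$(dirname "$file")/$target ;;    # relative target
      esac
      ls -ld -- "$file" || return 1        # fails here on a missing target
   done
}

demo=$(mktemp -d)
echo "some semi-random text here" > "$demo/target.txt"
ln -s target.txt "$demo/link5.txt"
ln -s link5.txt  "$demo/link4.txt"
walk_links "$demo/link4.txt"               # two links, then the target
rm -rf "$demo"
```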

You know where it started? With awk itself. My LM9-gnome defaults to gawk, and my LM9-LXDE defaults to mawk. When I found that out, I asked myself, What is an easy way to find that out? tracklink.awk was the answer.

Here:
Code: Select all
lrwxrwxrwx 1 root root 1 2010-08-28 00:04 /usr/bin/X11 -> .
drwxr-xr-x 2 root root 65536 2011-12-12 10:36 /usr/bin/.
I didn't even know links to directories were possible before seeing this. (Or at least I don't remember seeing one.)

Hmm, that leads me to this:
Code: Select all
$ readlink -f /usr/bin/X11
/usr/bin
$ resolvelink /usr/bin/X11
/usr/bin/.
resolvelink should probably get rid of the last '.' Well, that's a fix for another time--should be simple.

Post by Jesse654 on Mon Jan 23, 2012 3:08 pm

Vincent Vermeulen wrote: Perhaps a useful feature missing from this, showing the permissions on the containing directory of a symbolic link.

Ok, amigo, here is tracklink3.awk which:
- adds option to print the Containing DIRectory of the symbolic link (-v CDIR=1)
- removes '.' at end of directory listing, if any, so any dir will (should) end with a '/'

Usage:
Code: Select all
awk -v CDIR=1 -f tracklink3.awk linkfile ...


tracklink3.awk:
Code: Select all
#  tracklink3.awk - SJB - T1111.22 - F1201.20, M1201.23
#  given a file, see if it is a link and, if so, track it until the final
#  target file is found; handles relative links and resolves them so they
#  don't include /dir/../ notation;
#  removes '.' at end of directory listing, if any, so any dir will (should) end
#  with a '/';
#  adds option to print the Containing DIRectory of the symbolic link (-v CDIR=1)

#  Usage:
#  awk [-v CDIR=1] -f tracklink3.awk linkfile ...

BEGIN {
   if ( ARGC < 2 )
   {
      ErrOut("Usage:")
      ErrOut("  awk [-v CDIR=1] -f tracklink3.awk linkfile ...")
      exit 1
   }
   argc = 1
   while ( argc < ARGC )
   {
      if ( argc > 1 )                  #  print a blank line between
         printf "\n"                  #   trackings

      linkfile = ARGV[argc]            #  potential link file
      ARGV[argc++] = ""               #  take file off command line list

      delete listl                  #  reset; "listl" checks for
      listlnum = 0                  #   circular links
      path = ""

      if ( CDIR == 1 )   #   then print the Containing DIRectory of the link
      {
         cdirndx = PosLastOccurCh("/", linkfile)
         if ( cdirndx )
            cdir = substr(linkfile, 1, cdirndx)
         else
            cdir = "."               #  current directory
         lsl = GetLsl(cdir)            #  get   "ls -ld" listing of cdir
         PrintLsl(lsl, cdir)            #  print "ls -ld" listing of cdir
      }

      listl[linkfile] = ++listlnum      #  add to list
      lsl = GetLsl(linkfile)            #  get   "ls -ld" listing
      PrintLsl(lsl, linkfile)            #  print "ls -ld" listing

      while ( substr(lsl, 1, 1) == "l" )   #  while link file
      {
         numA = split(lsl, A)
         newlinkfile = A[numA]         #  get last field in listing

         #  if newlinkfile does not begin with "/" then it is
         #   relative, so add path from previous linkfile
         if ( substr(newlinkfile, 1, 1) != "/" )
         {
            ndx = PosLastOccurCh("/", linkfile)
            newpath = substr(linkfile, 1, ndx)
            if ( newpath )
               path = newpath
            newlinkfile = path newlinkfile
            gsub("/[^/]+/[.][.]/", "/", newlinkfile)
         }                        #  replace /dir/../ with /
         #gsub("/[.]/", "/", newlinkfile)   #  replace /./ with /
         sub( "/[.]$", "/", newlinkfile)   #  replace /. at end with /

         linkfile = newlinkfile
         if ( linkfile in listl )
         {
            ErrOut("tracklink3.awk:  circular links")
            break
         }
         listl[linkfile] = ++listlnum   #  same add,
         lsl = GetLsl(linkfile)         #   get and
         PrintLsl(lsl, linkfile)         #   print as above
      }
   }
}

#  returns "ls -ld" listing; -d option needed in case target is a directory
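function GetLsl(linkfile,      cmd, lsl, q)
{
   q = linkfile
   gsub(/'/, "'\\''", q)            #  escape any ' so the name can be
   cmd = "ls -ld -- '" q "' 2> /dev/null"   #   single-quoted for the shell
   cmd | getline lsl
   close(cmd)
   return lsl
}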

#  prints the "ls -ld" listing or, if empty, prints error msg with linkfile name
function PrintLsl(lsl, linkfile)
{
   if ( lsl )
      print lsl
   else
      ErrOut(linkfile ":  no listing")
}

#  returns the position of the last occurrence of char in string or zero if char
#  is not present
function PosLastOccurCh(char, string,      ndx, finalndx)
{
   while ( ndx = index(string, char) )
   {
      finalndx += ndx
      string = substr(string, ndx+1)
   }
   return finalndx
}

#  prints the given string to stderr
function ErrOut(s) { print s > "/dev/stderr" }


So, this:
Code: Select all
awk -v CDIR=1 -f tracklink3.awk /usr/bin/awk
will give this:
Code: Select all
drwxr-xr-x 2 root root 69632 2012-01-22 13:39 /usr/bin/
lrwxrwxrwx 1 root root 21 2010-08-28 00:04 /usr/bin/awk -> /etc/alternatives/awk
lrwxrwxrwx 1 root root 13 2010-08-28 00:04 /etc/alternatives/awk -> /usr/bin/gawk
-rwxr-xr-x 1 root root 317880 2009-12-10 18:24 /usr/bin/gawk


If you feel the need to replace /./ with / then there is a gsub() you can uncomment. I've been trying to think of a case where this might happen but, so far, cannot think of one.
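
One case where /./ can appear is a link whose target is written with a leading ./, for example one created with ln -s ./gawk awk: the relative target is appended to the link's directory, giving /usr/bin/./gawk. Below is a standalone sketch of what that gsub() does to such a path. The link in the example and the helper name squeeze_dot are hypothetical, and the loop is an addition, since a single pass leaves one /./ behind when two occur back to back.

```shell
# squeeze_dot (hypothetical helper): replace every /./ in a path with /
squeeze_dot() {
   printf '%s\n' "$1" | awk '{
      n = 1
      while ( n )                      # repeat so that /././ is fully reduced
         n = gsub("/[.]/", "/")
      print
   }'
}

squeeze_dot /usr/bin/./gawk
squeeze_dot /usr/bin/././gawk
```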

Enjoy.

Post by xenopeek on Mon Jan 23, 2012 3:28 pm

Cool :D This will come in handy. Perhaps I should dust off my old AWK manual :wink:

Post by Jesse654 on Mon Jan 23, 2012 3:55 pm

Vincent Vermeulen wrote: Perhaps I should dust off my old AWK manual :wink:
Go for it! gawk can even be used for network programming:
http://www.gnu.org/software/gawk/manual/html_node/TCP_002fIP-Networking.html#TCP_002fIP-Networking :)