As I was doing some sorting of my collection of CDs, I found out that processing the output of a find with sort -u was giving me a smaller list than plain sort, so I investigated. Using diff on the outputs, I found out the name of the repeated file, and I went to actually check how the same file can exist twice. Indeed, the file that was being repeated, 32, exists twice in fonts/basic. The filesystem is ISO9660:
root@user-desktop:/mnt/loop/fonts/basic# ls -al
total 22
dr-xr-xr-x 1 root root 2048 1978-07-01 00:47 .
dr-xr-xr-x 1 root root 12288 1978-07-01 00:48 ..
-r-xr-xr-x 1 root root 3884 1978-07-01 00:47 32
-r-xr-xr-x 1 root root 3884 1978-07-01 00:47 32
root@user-desktop:/mnt/loop/fonts/basic# ls -l -i
total 8
203394 -r-xr-xr-x 1 root root 3884 1978-07-01 00:47 32
203394 -r-xr-xr-x 1 root root 3884 1978-07-01 00:47 32
OK, so they are hardlinked because they have the same inode, but how can they have the same name? And why do they report only having one hardlink?
root@user-desktop:/mnt/loop/fonts/basic# mount|grep mnt/loop
/dev/loop0 on /mnt/loop type iso9660 (ro)
Can anybody think of an explanation? Is it something about the ISO9660 filesystem?
1 Answer
Let's make a plain ISO9660 filesystem:
mkdir cd
echo 'hello world' > cd/foo
echo '42' > cd/bar
genisoimage -o cd.iso cd
Mount, examine, and unmount it:
gnome-disk-image-mounter cd.iso
ls -li /media/user/CDROM
1474 -r-------- 1 user user 3 Jul 18 19:38 bar
1475 -r-------- 1 user user 12 Jul 18 19:38 foo
umount cd.iso
Now open the image in a hex editor, and replace FOO.;1 with BAR.;1. If it's helpful, the inode numbers on my system are really the offset of the directory entry into the ISO image, divided by 32. So look at around the offset printed by python3 -c 'print(hex(INODE * 32))', with INODE replaced by the number that ls -li shows.
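If you would rather not use a hex editor, the same edit can be scripted. This is only a sketch: rename_entry and patch_image are names I made up, and it assumes the old name occurs exactly once in the image, which should hold for the tiny plain ISO9660 image built above but not necessarily for one with Rock Ridge or Joliet extensions.

```python
def rename_entry(image, old=b"FOO.;1", new=b"BAR.;1"):
    """Return a copy of `image` (bytes) with the first `old` overwritten by `new`."""
    if len(old) != len(new):
        # Same length only, so no directory entry or file extent shifts.
        raise ValueError("old and new names must have the same length")
    pos = image.find(old)
    if pos < 0:
        raise ValueError("name not found in image")
    return image[:pos] + new + image[pos + len(old):]


def patch_image(path, old=b"FOO.;1", new=b"BAR.;1"):
    """Rewrite the image file at `path` in place (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(rename_entry(data, old, new))
```

Calling patch_image("cd.iso") on an unmounted copy of the image should have the same effect as the manual edit.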
ISO9660, like FAT, doesn't have inodes, but Linux pretends it does. All file metadata is stored directly in the directory, and because each variable-length entry is always longer than 32 bytes, no two entries can start in the same 32-byte slot, which guarantees all 'inodes' are unique.
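To make the arithmetic concrete, here is a small sketch of the mapping as I understand it from the numbers on my system: the 'inode' is the byte offset of the directory entry divided by 32 and rounded down. The function names are mine, and the starting offset in the usage note is just a value consistent with the listing above, not something read from the image.

```python
MIN_RECORD_LEN = 34  # 33 fixed bytes plus a name of at least 1 byte


def inode_for_offset(byte_offset):
    """'Inode' number for a directory entry starting at byte_offset."""
    return byte_offset // 32


def offset_window(inode):
    """Byte offsets at which an entry with this 'inode' can start."""
    return range(inode * 32, inode * 32 + 32)


def inodes_for_records(start, lengths):
    """'Inode' numbers for consecutive entries of the given lengths."""
    inodes = []
    pos = start
    for length in lengths:
        if length < MIN_RECORD_LEN:
            raise ValueError("directory entries are at least 34 bytes long")
        inodes.append(inode_for_offset(pos))
        pos += length
    return inodes
```

For instance, inodes_for_records(47104, [34, 34, 40, 40]) models '.', '..', and two 40-byte entries such as BAR.;1 and FOO.;1, and yields four different numbers ending in 1474 and 1475. Since every entry is longer than 32 bytes, consecutive entries always land in different slots.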
Now remount it and look again:
gnome-disk-image-mounter cd.iso
ls -li /media/user/CDROM
1474 -r-------- 1 user user 3 Jul 18 19:38 bar
1474 -r-------- 1 user user 3 Jul 18 19:38 bar
umount cd.iso
Notice the 'inode' numbers and file sizes. Both files are still there in the image, but the duplicate filename confuses Linux, causing it to list the first file twice. The second file is completely inaccessible now anyway.
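If you want to confirm that both entries really are still in the image, you can read the root directory yourself instead of going through the kernel. The sketch below assumes 2048-byte sectors, a primary volume descriptor at sector 16, and a plain image without extensions; the function names are mine. It reports each entry's name together with its own extent and length, so two entries named BAR.;1 with different sizes would show up as distinct records.

```python
import struct

SECTOR = 2048  # assumed logical block size (the usual default)


def root_directory_extent(image):
    """Return (byte offset, byte length) of the root directory."""
    pvd = image[16 * SECTOR:17 * SECTOR]
    if len(pvd) < 190 or pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no primary volume descriptor at sector 16")
    root = pvd[156:190]  # 34-byte root directory record
    (lba,) = struct.unpack_from("<I", root, 2)    # extent location
    (size,) = struct.unpack_from("<I", root, 10)  # data length
    return lba * SECTOR, size


def directory_records(image, start, size):
    """Yield (offset // 32, name, extent LBA, data length) per entry."""
    pos, end = start, start + size
    while pos < end:
        length = image[pos]
        if length == 0:
            # Zero padding: entries never span sectors, so skip ahead.
            pos = (pos // SECTOR + 1) * SECTOR
            continue
        (lba,) = struct.unpack_from("<I", image, pos + 2)
        (data_len,) = struct.unpack_from("<I", image, pos + 10)
        name_len = image[pos + 32]
        name = image[pos + 33:pos + 33 + name_len]
        yield pos // 32, name, lba, data_len
        pos += length
```

With the image loaded as bytes, list(directory_records(image, *root_directory_extent(image))) gives one tuple per entry, including the '.' and '..' entries, whose names are the single bytes 0x00 and 0x01.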