Directory implementation
Directory implementation
The selection of directory-allocation and directory-management
algorithms
significantly affects the efficiency, performance, and
reliability of
choosing one of these algorithms.
Linear List
The simplest method of implementing a directory is to use a linear list of file
names with pointers to the data blocks. This method is simple to program but
time-consuming to execute. To create a new file., we must first search the
directory to be sure that no existing file has the same name. Then, we add a
new entry at the end of the directory. To delete a file, we search the directory
for the named file, then release the space allocated to it.
To reuse the directory entry, we can do one of several things. We can mark the
entry as unused (by assigning it a special name, such as an all-blank name, or
with a used-unused, bit in each entry), or we can attach it to a list of free
directory entries. A third alternative is to copy the last entry in the
directory into the freed location and to decrease the length of the directory.
A linked list can also be used to decrease the time required to delete a file.
The real disadvantage of a linear list of directory entries is that finding a
file requires a linear search. Directory information is used frequently, and
users will notice if access to it is slow. In fact, many operating systems implement
a software cache to store the most recently used directory information. A cache
hit avoids the need to constantly reread the information from disk. A sorted
list allows a binary search and decreases the average search time.
Hash Table
Another data structure used for a file directory is a hash table. With this
method, a linear list stores the directory entries, but a hash data structure
is also used. The hash table takes a value computed from the file name and
returns a pointer to the file name in the linear list. Therefore, it can greatly
decrease the directory search time. Insertion and deletion are also fairly
straightforward, although some provision must be made for collisions—situations
in which two file names hash to the same location.
The major difficulties with a hash table are its generally fixed size and the
dependence of the hash function on that size. For example, assume that we make
a linear-probing hash table that holds 64 entries. The hash function converts
file names into integers from 0 to 63, for instance, by using the remainder of
a division by 64. If we later try to create a 65th file, we must enlarge the
directory hash table—say, to 128 entries. As a result, we need a new hash
function that must map file names to the range 0 to 127, and we must reorganize
the existing directory entries to reflect their new hash-function values.
Alternatively, a chained-overflow hash table can be used. Each hash entry can
be a linked list instead of an individual value, and we can resolve collisions
by adding the new entry to the linked list. Lookups may be somewhat slowed,
because searching for a name might require stepping through a linked list of
colliding table entries. Still, this method is likely to be much faster than a
linear search through the entire directory
https://youtu.be/MfhjkfocRR0
https://youtu.be/MfhjkfocRR0
Comments
Post a Comment