|
|
Here is a list of all NTFS design issues which have come up that effect the structure, along with current resolution (if there is one) of the issue. The resolution of these issues affects the "NTFS Design Specification 1.1" issued May 29, 1991. This list will be the final qualification to the spec until there is time to update it to a form which reflects the actual implementation. Of course the most precise definition of NTFS will always be in the header file which describes its structure: ntfs.h.
These issues have been collected primarily from our own internal review and the feedback received from MarkZ. They are listed here in no particular order.
Issue 1:
Support for nontagged attributes is a pain in the low-level attribute routines, as well as in Format and ChkDsk. They are of very little value to the File System in terms of space or performance.
Resolution:
Nontagged attributes are being dropped for the purposes of NTFS's own use of attributes to implement the disk structure. Nontagged attributes will be supported with the general table support.
Issue 2:
The EXTERNAL_ATTRIBUTES attribute, should have a better name, and its definition should be changed to simplify various NTFS algorithms.
Resolution:
The attribute name has been changed to the ATTRIBUTE_LIST attribute. It is still only created when a file requires more than one file record segment. At that time it is created to list all attributes (including those in the base file record) by type code and (optional) name. it is ordered by Attribute Type Code and Attribute Name.
One reason for this change is to facilitate the enumeration of all attributes for a file with multiple file record segments. This slightly different definition also gives NTFS's attribute placement policy more freedom to shuffle attributes around within the file record segments.
Issue 3:
Attribute ordering rules within the file, within each file record segment, and within the ATTRIBUTE_LIST were not completely specified.
Resolution:
The only rule for the ordering of attributes within each file, if there are multiple file record segments, is that STANDARD_INFORMATION must be in the base file record segment, and (at least the first part of) the ATTRIBUTE_LIST attribute must also be in the base file record segment. In general, the system should try to keep the other system-defined attributes with the lowest Attribute Type Codes present in the base file record segment when possible, for performance reasons.
Within each file record segment, attributes will be ordered by type code, name, and then value. (If an attribute is not unique in type code and name, then it must be indexed and the value must be referenced.)
The entries of the ATTRIBUTE_LIST will be ordered by attribute code and name.
Reliance on these ordering rules may be used to speed up attribute lookup algorithms.
Issue 4:
NTFS is NOT secure on removeable media without data encryption.
Resolution:
Functionality for the encryption of communications and physical media is already planned for Product 2 of NT, at which time we will decide what the best mechanism will be for integrating this support with removeable NTFS volumes. We must insure now that this can be implemented in a upward-compatible manner.
Issue 5:
It would be very desirable for WINX to have the ability to uniquely identify and open files by a small number.
Resolution:
Logically the ability to use this functionality must be controlled by some privilege, as it is expensive and nearly impossible to come up with a consistent strategy on how to do correct path traversal checking, in a system such as NTFS which supports multiple directory links to a single file. Once the requirement for a special privilege is accepted, it is relatively easy for NTFS to support an API which would allow files to be opened by their (64-bit) File Reference number. The File Reference is perfect for this purpose, as it includes a 16-bit cyclically-reused sequence number to detect the attempt to use a stale File Reference. I.e., the original file with the same 48-bit Base File Record address has been deleted, and a new file has been created at the same address.)
THIS REQUIRES A NEW NT I/O API.
Issue 6:
Enumeration of files in a directory in NT could be very slow, since to get more than just a file's name requires reading (at least) the base file record for the file.
Resolution:
The initial NT-based implementation of NTFS will come up with a strategy for clustering file record segments together in the MFT for files created in the same directory. Current thinking is that this will be done *without* change to the NTFS structure definition. So, for example, the first 128 files in a directory might be contiguous in the MFT, and then the second 128 will also be contiguous, etc. This will allow the implementation to prefetch files up to 128 file record segments at a time with a large spiral read, then expect cache hits during the enumeration.
Secondly, at some point the implementation will cache enumeration information, to make subsequent enumeration of the same directory extremely fast.
Issue 7:
Is it an unnecessary complexity to NTFS to support multiple collating rules, as opposed to a simple byte-comparison collation? Note that frequently the caller collates himself anyway.
Resolution:
This is not resolved yet pending further discussion.
The current reason NTFS plans to support multiple collating rules, is that collating in the caller can have bad performance and response characteristics in large directories. For example, consider a Windows App which requests the enumeration of a directory with 200 files (possibly over the network to a heavily loaded server), and it is going to display this enumeration in a List box with 10 or 20 lines. If it does not have to collate the enumeration, it can start displaying as soon as it receives part of the enumeration. Otherwise it has to wait to get the entire enumeration before it can collate and display anything.
Issue 8:
Should there be a bit in STANDARD_INFORMATION to indicate whether a file record has an INDEX attribute or not?
Resolution:
There is no plan to do this, unless we find additional reasons to do so that we are missing. Currently we see how this bit could speed the rejection of illegal path specifications, but it would not speed the acceptance of correct ones. Note that from the structure of NTFS, it is legal for a file to have both an INDEX attribute *and*, for example, a DATA attribute.
Issue 9:
The algorithms and consistency rules surrounding the 8.3 indices need to be clarified.
Resolution:
This will be done by 7/31.
Issue 10:
Why not eliminate the VERSION attribute and move it to STANDARD_INFORMATION?
Resolution:
We will do this, and then define an additional file attribute and/or field which controls whether or not versioning is enabled and possibly how many versions are allowed for a file.
Issue 11:
There should be a range of system-defined attribute codes which are not allowed to be duplicated, as this will speed up some of the lookup algorithms.
Resolution:
This will be done.
Issue 12:
Is duplication of the log file the correct way to add redundancy to NTFS to allow mounting in the event of read errors.
Resolution:
Upon further analysis, it was determined that the needed redundancy was incorrectly placed. It is more important to duplicate the first few entries of the MFT, than to duplicate the start of the log file. This change will be made.
Issue 13:
The spec describes how access to individual attribute types may be controlled by special ACEs, which is incompatible with the current NT APIs and our security strategy.
Resolution:
This will be fixed. Access to user-defined attributes will be controlled by the READ_ATTRIBUTES and WRITE_ATTRIBUTES access rights.
Issue 14:
A file attribute should be added which supports more efficient handling of temporary files.
Resolution:
An attribute will be added for files, and possibly directories, which will enable NTFS to communicate "temporary file" handling to the Cache Manager. Temporary files will never be set dirty in the Cache Manager or written to disk by the Lazy Writer, although the File Record will be correctly updated to keep the volume consistant. If a temporary file is deleted, then all writes to its data are eliminated. If MM discovers that memory is getting tight, it may choose to flush data to temporary files, so that it can free the pages. In this case the correct data for the file will eventually be faulted back in.
This makes the performance of I/O to temporary files approach the performance of putting them on a RAM disk. An advantage over RAM disk, though, is that no one has to specify how much space should be used for this purpose.
Issue 15:
It would be nice to have some flag in each file record segment to say if it is in use or not. This would simplify chkdsk algorithms, although it would require the record to be written on deletion.
Resolution:
This will be done. It is difficult to suppress the write of the file record on deletion anyway.
|