Doc. no. WG21/N1841=05-0101
Date: 2005-08-23
Project: Programming Language C++
Reply to: Beman Dawes <[email protected]>
Introduction
Motivation and Scope
Impact on the Standard
Important Design Decisions
Proposed Text for TR2
Introductory chapter
Filesystem library chapter
Definitions
Requirements
Requirements on programs
Requirements on implementations
Header <filesystem> synopsis
Path traits
Class template basic_path
Pathname formats
Pathname grammar
Filename conversion
Requirements
basic_path constructors
basic_path assignments
basic_path comparisons
basic_path modifiers
basic_path operators
basic_path observers
basic_path iterators
Class template basic_filesystem_error
basic_filesystem_error constructors
basic_filesystem_error observers
Class template basic_directory_entry
basic_directory_entry constructors
basic_directory_entry modifiers
basic_directory_entry observers
basic_directory_entry comparisons
Class template basic_directory_iterator
basic_directory_iterator constructors
Class template basic_recursive_directory_iterator
Non-member operational functions
Status functions
Predicate functions
Attribute functions
Other operations functions
Convenience functions
Additions to header <cerrno>
Additions to header <fstream>
Suggestions for <fstream> implementations
Path decomposition table
Issues
Acknowledgements
References
This paper proposes addition of a filesystem library component to the C++ Standard Library Technical Report 2. The proposal is based on the Boost Filesystem Library (see www.boost.org/libs/filesystem).
The library provides portable facilities to query and manipulate paths, files, and directories. The Boost version of the library is widely used. It would be a pure addition to the C++ standard, leaving in place existing standard library functionality in the relatively few areas where there is overlap.
Users say they prefer the Boost Filesystem Library interface to native operating system or POSIX APIs, even in code without portability requirements, because the design follows modern C++ practice.
The proposed text includes an example of a program using the library.
Why is this important?
The motivation for the library is the desire to perform safe, portable, script-like filesystem operations from within C++ programs. Because the C++ Standard Library currently contains no facilities for such filesystem tasks as directory iteration or directory creation, programmers currently must rely on operating system specific interfaces, making it difficult to write portable programs.
The intent is not to compete with Python, Perl, or shell scripting languages, but rather to provide file system operations where C++ is already the language of choice. The design encourages, but does not require, safe and portable usage.
What kinds of problems does it address, and what kinds of programmers is it intended to support?
The library addresses everyday needs, for both application programs and libraries. It is useful across every application domain that uses files. It is intended to be useful to all levels of programmers, from rank beginners to seasoned experts.
Is it based on existing practice?
Yes, very much so. The proposal is based on the Boost Filesystem Library, which has been in use since 2002 and by now is in very wide use. For example, current versions of Adobe Systems products such as Adobe Reader use the Boost Filesystem Library on the many platforms they support.
Note, however, that until recently all the Boost experience was with a narrow-character only version of the library. The internationalized version as described in this proposal is just starting to be used, and will not be fully released until Boost release 1.34.
The underlying mechanisms have been in use for decades on the world's most widespread operating systems, such as POSIX, Windows, and various mainframe operating systems. What this proposal brings to the table is an approach that is C++ Standard Library friendly and fully internationalized.
Is there a reference implementation?
Yes. The Boost Filesystem Library is freely and publicly available. The Boost library will track the TR2 proposed library as the proposal evolves.
What does it depend on, and what depends on it?
It depends on some standard library components, such as basic_string. No other proposals depend on it.
If a revision to the Code Conversion Proposal (See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1683.html) is accepted, it may be advantageous for the Filesystem Library to use that library rather than the current code conversion facilities proposed below.
Is it a pure extension, or does it require changes to standard components?
Most of the proposed library is a pure extension.
There are additions to header <cerrno>. Since the critical portions that might require change to C headers (always a sore point) are already mandated for POSIX compliance, and codify existing practice for many non-POSIX implementations such as for Windows, it is not expected that they will cause any problems.
There are additions to header <fstream>. These have been carefully specified to avoid breaking existing code in common operating environments such as POSIX, Windows, and OpenVMS. See Suggestions for <fstream> implementations for techniques to avoid breaking existing code in other environments, particularly on operating systems allowing slashes in filenames.
Can it be implemented using today's compilers, or does it require language features that will only be available as part of C++0x?
It can be (and has been) implemented with today's compilers.
There is one minor function that can best be implemented by an addition to current C++ runtime libraries, although an acceptable workaround is documented.
On operating systems with built-in support for wide-character file names, such as Windows, a high-quality implementation of the header <fstream> additions requires an addition to the C++ Standard Library implementation. The addition is relatively small and localized. There is a workaround that avoids modifying the standard library, but it is very much a hack and depends on a Windows feature (8.3 filename support) which some users disable, thereby disabling the workaround. The issue doesn't affect implementations on operating systems which only support narrow character file names.
Many of the specific design decisions were driven by the desire to provide a modern C++ interface that works well with the C++ Standard Library. The intent is that Standard Library users can become comfortable with the Filesystem Library in very short order.
The proposed library encourages both syntactic and semantic portability, yet does not force implementors into heroic efforts on hopeless systems. This balances the benefits to users of both code and knowledge portability with the realities faced by implementors on some operating systems.
Because of the desire to support simple "script-like" usage, use cases often drove design choices. For example, users can write if (exists("foo")) rather than the lengthier if (exists(path("foo"))).
Because filesystem operations often encounter unexpected runtime errors, the library reports runtime errors via C++ exceptions, and ensures enough information is provided for meaningful error messages, including internationalized error messages.
What alternatives did you consider, and what are the tradeoffs?
Additional observers and modifiers for file system attributes. Attribute functions which cannot supply portable semantics are not provided, avoiding the illusion of portability in cases where it cannot in fact exist.
A larger number of operational convenience functions. Convenience functions (functions which can be portably created by composition from basic functions) were not provided unless there was widespread agreement on usefulness and need.
Compile-time or run-time options for operational functions. Numerous trial implementations were abandoned because the added complexity outweighed the benefits, and because consensus could not be reached on the feature set.
Automatic path name checking. This feature, supplied by the Boost library for several years, allowed users to specify both default and per-constructor path name checking, so that the desired degree of portability could be enforced automatically. This implicit name checking was abandoned because of user confusion and complaints.
Separate path types for regular file and directory pathnames. Pathname formats that use different syntax for regular pathnames versus directory pathnames are passing into extinction. Why prolong the agony at the cost of torturing those using modern systems? It is perhaps significant that one of the few web sites dedicated to preserving a dual pathname format operating system is named Deathrow (http://deathrow.vistech.net/).
Single path type which can at runtime accept narrow or wide character pathnames. Although certainly interesting, and possibly superior, such a design would not interoperate well with the current Standard Library's compile-time typed basic_string. A new runtime polymorphic string class would be the best place to experiment with this concept, not a path class.
What are the consequences of your choices, for users and implementors?
The design has evolved over a period of four years of actual experience by Boost users, and the most frequent causes of user complaints (such as enforced name-checking and several over-strict preconditions) were eliminated. The TR process will allow further refinement. The intent is to ensure user needs are met.
Because the Boost implementation is tested and used in a wide range of POSIX and Windows environments, many implementation concerns have already been addressed.
What decisions are left up to implementors?
Because implementations of the library are dependent on facilities of the underlying operating system, implementors are given unusual freedom to redefine semantics of the library. That being said, implementors are given strong normative encouragement to provide the TR described semantics whenever feasible.
If there are any similar libraries in use, how do their design decisions compare to yours?
There are a number of libraries which address the problem domain. Most of the C/C++ libraries have C, rather than C++ interfaces. For example, see the Apache Portable Runtime Project (http://apr.apache.org). The ACE toolkit (http://www.cs.wustl.edu/~schmidt/ACE.html) uses a C++ approach, but doesn't mesh well with the C++ Standard Library. For example, the ACE directory iterator differs greatly from Standard Library iterator requirements.
Gray-shaded italic text is commentary on the proposal. It is not to be added to the TR.
Italic text is editorial guidance. It is not to be added to the TR.
Add to the introductory section of the TR:
The following standard contains provisions which, through reference in this text, constitute provisions of this Technical Report. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to agreements based on this Technical Report are encouraged to investigate the possibility of applying the most recent editions of the standard indicated below. Members of IEC and ISO maintain registers of currently valid International Standards.
[Note: ISO/IEC 9945:2003 is also IEEE Std 1003.1-2001 and The Open Group Base Specifications, Issue 6, and is also known as The Single Unix Specification, Version 3 (see Footnote 2). It is available from each of those organizations, and may be read online or downloaded from www.unix.org/single_unix_specification/ -- end note]
ISO/IEC 9945:2003, with the indicated corrections, is hereinafter called POSIX.
Some library behavior in this Technical Report is defined by reference to POSIX. How such behavior is actually implemented is unspecified.
[Note: This constitutes an "as if" rule for implementation of operating system dependent behavior. Presumably implementations will actually call native operating system APIs. --end note]
Implementations are encouraged, but not required, to support such behavior as it is defined by POSIX. Implementations shall document any behavior that differs from the POSIX defined behavior. Implementations that do not support exact POSIX behavior are encouraged to provide behavior as close to POSIX behavior as is reasonable given the limitations of actual operating systems. If an implementation cannot provide any reasonable behavior, the implementation shall report an error in an implementation-defined manner.
[Note: Such errors might be reported by an #error directive, a static_assert, a basic_filesystem_error exception, a special return value, or some other manner. --end note]
Footnote 1: POSIX® is a registered trademark of The IEEE.
Footnote 2: UNIX® is a registered trademark of The Open Group.
Add a new clause to the TR:
This clause describes components that C++ programs may use to interrogate and manipulate files (including directories), and certain of their attributes.
This clause applies only to hosted implementations (C++ Std, 1.4, Implementation compliance [intro.compliance]).
[Note: This clause applies to any hosted implementation. Specific operating systems such as OpenVMS (see Footnote 3), UNIX, and Windows (see Footnote 4) are mentioned only for purposes of illustration or to give guidance to implementors. No slight to other operating systems is implied or intended. --end note]
Unless otherwise specified, all components described in this clause are declared in namespace std::tr2::sys.
[Note: The sys subnamespace prevents collisions with names already in the standard library and emphasizes reliance on the operating system dependent behavior inherent in file system operations. -- end note]
The Effects and Postconditions of functions described in this clause may not be achieved in the presence of race conditions. No diagnostic is required.
If the possibility of race conditions makes it unreliable for a program to test for a precondition before calling a function described in this clause, Requires is not specified for the condition. Instead, the condition is specified as a Throws condition.
[Note: As a design practice, preconditions are not specified when it is unreasonable for a program to detect them prior to calling the function. -- end note]
Some error conditions, such as empty path function arguments, are specified both in Requires and in Throws elements.
[Note: This dual specification is employed when an error condition is trivially detectable by the C++ program, is not subject to race conditions, and is either a serious error or will be detected by most operating system API calls in any case. -- end note]
Footnote 3: OpenVMS® is a registered trademark of Hewlett-Packard Development Company.
Footnote 4: Windows® is a registered trademark of Microsoft Corporation.
The following definitions shall apply to this clause:
File: An object that can be written to, or read from, or both. A file has certain attributes, including type. File types include regular file, symbolic link, and directory. Other types of files may be supported by the implementation.
File system: A collection of files and certain of their attributes.
Filename: The name of a file. The format is as specified by the POSIX Filename base definition.
Path: A sequence of elements which identify a location within a filesystem. The elements are the root-name, root-directory, and each successive filename. See Pathname grammar.
Pathname: A character string that represents a path.
Link: A directory entry object that associates a filename with a file. On some file systems, several directory entries can associate names with the same file.
Hard link: A link to an existing file. Some file systems support multiple hard links to a file. If the last hard link to a file is removed, the file itself is removed.
[Note: A hard link can be thought of as a shared-ownership smart pointer to a file. -- end note]
Symbolic link: A type of file with the property that when the file is encountered during pathname resolution, a string stored by the file is used to modify the pathname resolution.
[Note: A symbolic link can be thought of as a raw pointer to a file. If the file pointed to does not exist, the symbolic link is said to be a "dangling" symbolic link. -- end note]
Slash: The character '/', also known as solidus.
Dot: The character '.', also known as period.
Race condition: The condition that occurs when multiple threads, processes, or computers interleave access and modification of the same object within a file system.
The arguments for template parameters named Path, Path1, or Path2 described in this clause shall be of type basic_path, or a class derived from basic_path, unless otherwise specified.
Some function templates described in this clause have a template parameter named Path, Path1, or Path2. When called with a function argument s of type char* or std::string, the implementation shall treat the argument as if it were coded path(s). When called with a function argument s of type wchar_t* or std::wstring, the implementation shall treat the argument as if it were coded wpath(s). For functions with two arguments, implementations shall not supply this treatment when Path1 and Path2 are different types.
[Note: This "do-the-right-thing" rule allows users to write exists("foo"), taking advantage of class basic_path's string conversion constructor, rather than the lengthier and more error-prone exists(path("foo")). This is particularly important for the simple, script-like programs which are an important use case for the library. Calling two-argument functions with different types is a very rare usage, and may well be a coding error, so automatic conversion is not supported for such cases.
The implementation technique is unspecified. One possible implementation technique, using exists() as an example, is:
    template <class Path>
      typename boost::enable_if<is_basic_path<Path>,bool>::type exists(const Path& p);
    inline bool exists(const path& p) { return exists<path>(p); }
    inline bool exists(const wpath& p) { return exists<wpath>(p); }
The enable_if will fail for a C string or std::basic_string argument, which will then be automatically converted to a basic_path object via the appropriate basic_path conversion constructor. -- end note]
The two overloads are not given in the normative text because:
- Better techniques for achieving the desired effect may be developed, perhaps enabled by core language changes like Concepts.
- Implementations may prefer techniques that work with legacy compilers that do not support enable_if.
- Spelling out the overloads makes the text longer and harder to read without adding much benefit.
- More overloads will probably be needed for char16_t and char32_t (or whatever they end up being called), making it even less attractive to actually spell out each one.
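The overload technique described in the note can be tried in isolation. The following is a minimal sketch, not the proposed implementation: it assumes C++11 std::enable_if in place of boost::enable_if, uses a toy one-parameter basic_path, and replaces the real file system query with a hypothetical known_name() helper so that it runs anywhere.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Toy stand-in for the proposed class template; only what the sketch needs.
template <class String>
class basic_path {
public:
    typedef String string_type;
    typedef typename String::value_type value_type;
    basic_path(const string_type& s) : s_(s) {}   // string conversion constructor
    basic_path(const value_type* s) : s_(s) {}    // C string conversion constructor
    const string_type& string() const { return s_; }
private:
    string_type s_;
};
typedef basic_path<std::string> path;
typedef basic_path<std::wstring> wpath;

// Trait: true only for basic_path types.
template <class T> struct is_basic_path : std::false_type {};
template <> struct is_basic_path<path> : std::true_type {};
template <> struct is_basic_path<wpath> : std::true_type {};

// Hypothetical in-memory "file system": only the name foo exists.
inline bool known_name(const std::string& s) { return s == "foo"; }
inline bool known_name(const std::wstring& s) { return s == L"foo"; }

// Function template: removed from overload resolution unless Path is a basic_path.
template <class Path>
typename std::enable_if<is_basic_path<Path>::value, bool>::type
exists(const Path& p) { return known_name(p.string()); }

// Non-template overloads: chosen for string-like arguments, which are
// converted to path or wpath by the conversion constructors.
inline bool exists(const path& p) { return exists<path>(p); }
inline bool exists(const wpath& p) { return exists<wpath>(p); }

// Usage: C strings, std::basic_string objects, and path objects all work.
inline void do_the_right_thing_demo() {
    assert(exists("foo"));
    assert(exists(std::string("foo")));
    assert(exists(L"foo"));
    assert(exists(path("foo")));
    assert(!exists(wpath(L"bar")));
}
```

A narrow string cannot convert to wpath and a wide string cannot convert to path, so each call has exactly one viable overload.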
Implementations of functions described in this clause are permitted to call the application program interface (API) provided by the operating system. If such an operating system API call results in an error, implementations shall report the error by throwing exception basic_filesystem_error, unless otherwise specified.
[Note: Such exceptions and the conditions that cause them to be thrown are not explicitly described in each Throws element within this clause. Because hardware failures, network failures, race conditions, and a plethora of other errors occur frequently in file system operations, users should be aware that any file system operation, no matter how apparently innocuous, may throw an exception. -- end note]
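The reporting rule above can be illustrated with a small sketch. It is only an assumption about how an implementation might translate a failed operating system call into an exception: toy_filesystem_error and its path() and code() members are hypothetical simplifications, and std::rename from <cstdio> stands in for a native API call.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical, much-simplified stand-in for basic_filesystem_error.
class toy_filesystem_error : public std::runtime_error {
public:
    toy_filesystem_error(const std::string& what_arg,
                         const std::string& path, int code)
      : std::runtime_error(what_arg), path_(path), code_(code) {}
    const std::string& path() const { return path_; }  // path involved in the failure
    int code() const { return code_; }                 // errno value captured at the failure
private:
    std::string path_;
    int code_;
};

// Calls the underlying facility and converts a failure into an exception.
inline void rename_or_throw(const std::string& from, const std::string& to) {
    errno = 0;
    if (std::rename(from.c_str(), to.c_str()) != 0)
        throw toy_filesystem_error("rename failed", from, errno);
}

// Returns true if the rename failed and the failure arrived as an exception.
inline bool rename_reports_failure(const std::string& from, const std::string& to) {
    try {
        rename_or_throw(from, to);
    } catch (const toy_filesystem_error& e) {
        assert(e.path() == from);
        return true;
    }
    return false;
}
```

Callers that expect failures as a matter of course, such as a rename of a file another process may have removed, would wrap the call in a try block as rename_reports_failure does.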
Header <filesystem> synopsis
    namespace std { namespace tr2 { namespace sys {

      template <class String, class Traits> class basic_path;

      struct path_format_t{};
      extern path_format_t portable;
      extern path_format_t native;

      struct path_traits;
      struct wpath_traits;

      typedef basic_path< std::string, path_traits > path;
      typedef basic_path< std::wstring, wpath_traits > wpath;

      template<class Path> struct is_basic_path;

      template<class Path> struct slash { static const char value = '/'; };
      template<class Path> struct dot { static const char value = '.'; };

      typedef int errno_type;  // type is determined by the C standard
      typedef implementation-defined system_error_type;  // usually int

      template <class Path> class basic_filesystem_error;
      typedef basic_filesystem_error<path> filesystem_error;
      typedef basic_filesystem_error<wpath> wfilesystem_error;

      typedef bitmask-type status_flags;  // C++ std, 17.3.2.1.2 Bitmask types [lib.bitmask.types]

      // values are for exposition only; actual values are unspecified
      static const status_flags error_flag(1);
      static const status_flags not_found_flag(1<<1);
      static const status_flags directory_flag(1<<2);
      static const status_flags regular_flag(1<<3);
      static const status_flags other_flag(1<<4);
      static const status_flags symlink_flag(1<<5);

      struct symlink_t{};
      extern symlink_t symlink;

      template <class Path> class basic_directory_entry;
      typedef basic_directory_entry<path> directory_entry;
      typedef basic_directory_entry<wpath> wdirectory_entry;

      template <class Path> class basic_directory_iterator;
      typedef basic_directory_iterator<path> directory_iterator;
      typedef basic_directory_iterator<wpath> wdirectory_iterator;

      template <class Path> class basic_recursive_directory_iterator;
      typedef basic_recursive_directory_iterator<path> recursive_directory_iterator;
      typedef basic_recursive_directory_iterator<wpath> wrecursive_directory_iterator;

      // status functions
      template <class Path> status_flags status(const Path& p, system_error_type* ec=0);
      template <class Path> status_flags status(const Path& p, const symlink_t&, system_error_type* ec=0);

      // predicate functions
      template <class Path> bool exists(const Path& p);
      template <class Path> bool is_directory(const Path& p);
      template <class Path> bool is_regular(const Path& p);
      template <class Path> bool is_other(const Path& p);
      template <class Path> bool is_symlink(const Path& p);
      template <class Path> bool is_empty(const Path& p);
      template <class Path1, class Path2>
        bool equivalent(const Path1& p1, const Path2& p2);

      // attribute functions
      template <class Path> Path current_path();
      template <class Path> const Path& initial_path();
      template <class Path> intmax_t file_size(const Path& p);
      template <class Path> std::time_t last_write_time(const Path& p);
      template <class Path>
        void last_write_time(const Path& p, const std::time_t new_time);

      // operations functions
      template <class Path> bool create_directory(const Path& dp);
      template <class Path1, class Path2>
        void create_hard_link(const Path1& old_fp, const Path2& new_fp);
      template <class Path> bool remove(const Path& p);
      template <class Path1, class Path2>
        void rename(const Path1& from_p, const Path2& to_p);
      template <class Path1, class Path2>
        void copy_file(const Path1& from_fp, const Path2& to_fp);
      template <class Path> Path system_complete(const Path& p);
      template <class Path>
        Path complete(const Path& p, const Path& base=initial_path<Path>());
      errno_type lookup_errno(system_error_type code);
      void system_message(system_error_type code, std::string & target);
      void system_message(system_error_type code, std::wstring & target);

      // convenience functions
      template <class Path> bool create_directories(const Path & p);
      template <class Path> typename Path::string_type extension(const Path & p);
      template <class Path> typename Path::string_type basename(const Path & p);
      template <class Path>
        Path replace_extension(const Path & p, const typename Path::string_type & new_extension);

    } // namespace sys
    } // namespace tr2
    } // namespace std
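Because status_flags is a bitmask type, callers test individual flags with bitwise operations. The sketch below assumes unsigned int as the underlying type and reuses the exposition-only values from the synopsis; has_flag() is a hypothetical helper, not part of the proposal.

```cpp
#include <cassert>

// Exposition-only stand-ins: the proposal leaves the actual type and values unspecified.
typedef unsigned int status_flags;
static const status_flags error_flag(1);
static const status_flags not_found_flag(1 << 1);
static const status_flags directory_flag(1 << 2);
static const status_flags regular_flag(1 << 3);
static const status_flags other_flag(1 << 4);
static const status_flags symlink_flag(1 << 5);

// Hypothetical helper: true if any bit of f is set in s.
inline bool has_flag(status_flags s, status_flags f) { return (s & f) != 0; }

inline void status_flags_demo() {
    // A made-up value with two flags set, purely to exercise the bit tests.
    status_flags s = symlink_flag | directory_flag;
    assert(has_flag(s, symlink_flag));
    assert(has_flag(s, directory_flag));
    assert(!has_flag(s, regular_flag));
    assert(!has_flag(s, not_found_flag | error_flag));
}
```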
This subclause defines requirements on classes representing path behavior traits, and defines two classes that satisfy those requirements for paths based on string and wstring. It also defines several additional path traits structure templates, and defines several specializations of them.
Class template basic_path defined in this clause requires additional types, values, and behavior to complete the definition of its semantics.
For purposes of exposition, Traits behaves as if it is a class with private members bool m_locked, initialized false, and std::locale m_locale, initialized
Path Behavior Traits Requirements
Expression | Requirements
Traits::external_string_type | A typedef which is a specialization of basic_string. The value_type is a character type used by the operating system to represent pathnames.
Traits::internal_string_type | A typedef which is a specialization of basic_string. The value_type is a character type to be used by the program to represent pathnames. Required to be the same type as the basic_path String template parameter.
Traits::to_external( p, is ) | is, converted by the m_locale codecvt facet to external_string_type.
Traits::to_internal( p, xs ) | xs, converted by the m_locale codecvt facet to internal_string_type.
Traits::imbue(loc) | Effects: if m_locked, throw. Otherwise, m_locked = true; m_locale = loc; Returns: void. Throws: basic_filesystem_error.
Traits::imbue(loc, std::nothrow) | Effects: if (!m_locked) m_locale = loc; bool temp(m_locked); m_locked = true; Returns: temp.
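The two imbue rows describe a set-once locale. The following sketch transcribes their Effects into a toy traits class; toy_traits is a hypothetical name, std::runtime_error stands in for basic_filesystem_error, and the initial value of the stored locale is an assumption of the sketch.

```cpp
#include <cassert>
#include <locale>
#include <new>        // std::nothrow, std::nothrow_t
#include <stdexcept>

struct toy_traits {
    // Throwing form: fails if a locale has already been imbued.
    static void imbue(const std::locale& loc) {
        if (locked()) throw std::runtime_error("locale already imbued");
        locked() = true;
        locale() = loc;
    }
    // Non-throwing form: returns the previous locked state.
    static bool imbue(const std::locale& loc, const std::nothrow_t&) {
        if (!locked()) locale() = loc;
        bool temp(locked());
        locked() = true;
        return temp;
    }
private:
    // Exposition members m_locked and m_locale, held as function-local statics.
    static bool& locked() { static bool v = false; return v; }
    static std::locale& locale() { static std::locale v; return v; }
};

inline void imbue_demo() {
    // First call: not yet locked, so the locale is stored and false is returned.
    bool was_locked = toy_traits::imbue(std::locale::classic(), std::nothrow);
    assert(!was_locked);
    // Second call: already locked, so the stored locale is left alone.
    was_locked = toy_traits::imbue(std::locale::classic(), std::nothrow);
    assert(was_locked);
    // The throwing form reports the same condition with an exception.
    bool threw = false;
    try { toy_traits::imbue(std::locale::classic()); }
    catch (const std::runtime_error&) { threw = true; }
    assert(threw);
}
```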
Type is_basic_path shall be a UnaryTypeTrait (TR1, 4.1). The primary template shall be derived directly or indirectly from std::tr1::false_type. Type is_basic_path shall be specialized for path, wpath, and any user-specialized basic_path types, and such specializations shall be derived directly or indirectly from std::tr1::true_type.
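As an illustration of the specialization requirement, the sketch below declares the trait and specializes it for a user-defined path type. It assumes C++11 std::true_type and std::false_type in place of the TR1 types; my_traits and my_path are hypothetical user names.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Declarations only; the trait works with incomplete types.
template <class String, class Traits> class basic_path;
struct path_traits;
struct wpath_traits;
typedef basic_path<std::string, path_traits> path;
typedef basic_path<std::wstring, wpath_traits> wpath;

// Primary template derives from false_type; specializations from true_type.
template <class Path> struct is_basic_path : std::false_type {};
template <> struct is_basic_path<path> : std::true_type {};
template <> struct is_basic_path<wpath> : std::true_type {};

// A user-specialized basic_path must be registered the same way.
struct my_traits;                                    // hypothetical user traits
typedef basic_path<std::string, my_traits> my_path;  // hypothetical user path type
template <> struct is_basic_path<my_path> : std::true_type {};

inline void is_basic_path_demo() {
    assert(is_basic_path<path>::value);
    assert(is_basic_path<wpath>::value);
    assert(is_basic_path<my_path>::value);
    assert(!is_basic_path<std::string>::value);
}
```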
Structure templates slash and dot are supplied with values of type char. If a user-specialized basic_path has a value_type type which is not convertible from char, the templates slash and dot shall be specialized to provide value with type which is convertible to basic_path::value_type.
Class template basic_path
    namespace std { namespace tr2 { namespace sys {

      template <class String, class Traits> class basic_path
      {
      public:
        typedef basic_path<String, Traits> path_type;
        typedef String string_type;
        typedef typename String::value_type value_type;
        typedef Traits traits_type;
        typedef typename Traits::external_string_type external_string_type;

        // constructors/destructor
        basic_path();
        basic_path(const basic_path& p);
        basic_path(const string_type& s, path_format_t=portable);
        basic_path(const value_type* s, path_format_t=portable);
        template <class InputIterator>
          basic_path(InputIterator first, InputIterator last, path_format_t=portable);
        ~basic_path();

        // assignments
        basic_path& operator=(const basic_path& p);
        basic_path& operator=(const string_type& s);
        basic_path& operator=(const value_type* s);
        template <class InputIterator>
          basic_path& assign(InputIterator first, InputIterator last, path_format_t=portable);

        // comparisons
        bool operator<(const basic_path& that) const;
        bool operator==(const basic_path& that) const;
        bool operator!=(const basic_path& that) const;
        bool operator>(const basic_path& that) const;
        bool operator<=(const basic_path& that) const;
        bool operator>=(const basic_path& that) const;

        // modifiers
        basic_path& operator/=(const basic_path& rhs);
        basic_path& operator/=(const string_type& s);
        basic_path& operator/=(const value_type* s);
        template <class InputIterator>
          basic_path& append(InputIterator first, InputIterator last, path_format_t=portable);
        basic_path& remove_leaf();

        // observers
        const string_type string() const;
        const string_type file_string() const;
        const string_type directory_string() const;
        const external_string_type external_file_string() const;
        const external_string_type external_directory_string() const;
        string_type root_name() const;
        string_type root_directory() const;
        basic_path root_path() const;
        basic_path relative_path() const;
        string_type leaf() const;
        basic_path branch_path() const;
        bool empty() const;
        bool is_complete() const;
        bool has_root_name() const;
        bool has_root_directory() const;
        bool has_root_path() const;
        bool has_relative_path() const;
        bool has_leaf() const;
        bool has_branch_path() const;

        // iterators
        class iterator;
        typedef iterator const_iterator;
        iterator begin() const;
        iterator end() const;

        // operators
        basic_path operator/(const basic_path& rhs) const;
        basic_path operator/(const string_type& s) const;
        basic_path operator/(const value_type* s) const;
        template <class InputIterator>
          basic_path concat(InputIterator first, InputIterator last, path_format_t=portable);
      };

    } // namespace sys
    } // namespace tr2
    } // namespace std
A basic_path object stores a possibly empty path. The internal form of the stored path is unspecified.
Functions described in this clause which access files or their attributes do so by resolving a basic_path object into a particular file in a file hierarchy. The pathname, suitably converted to the string type, format, and encoding required by the operating system, is resolved as if by the POSIX Pathname Resolution mechanism. The encoding of the resulting pathname is determined by the Traits::to_external conversion function.
[Note: There is no guarantee that the path stored in a basic_path object is valid for a particular operating system or file system. -- end note]
Some functions in this clause return basic_path objects for paths composed partly or wholly of pathnames obtained from the operating system. Such pathnames are suitably converted from the actual format and string type supplied by the operating system. The encoding of the resulting path is determined by the Traits::to_internal conversion function.
For member functions described as returning "const string_type" or "const external_string_type", implementations are permitted to return "const string_type&" or "const external_string_type&" respectively.
[Note: This allows implementations to avoid unnecessary copies. Return-by-value is specified as const to ensure programs won't break if moved to a return-by-reference implementation. -- end note]
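The effect of the const return type can be seen with two toy classes, one returning by const value and one by const reference. Both class names are hypothetical; the point of the sketch is only that client code written against one compiles unchanged against the other.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy implementation that returns a copy, declared const as in the synopsis.
struct value_returning_path {
    explicit value_returning_path(const std::string& s) : s_(s) {}
    const std::string string() const { return s_; }
private:
    std::string s_;
};

// Toy implementation that avoids the copy by returning a reference.
struct reference_returning_path {
    explicit reference_returning_path(const std::string& s) : s_(s) {}
    const std::string& string() const { return s_; }
private:
    std::string s_;
};

// Client code: valid for either implementation because it only reads.
template <class P>
std::size_t string_length(const P& p) { return p.string().size(); }

// A mutating call such as p.string().append("x") is rejected for both
// classes, so no program can come to depend on modifying the returned copy.

inline void const_return_demo() {
    assert(string_length(value_returning_path("a/b")) == 3);
    assert(string_length(reference_returning_path("a/b")) == 3);
}
```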
There are two formats for string or sequence arguments that describe a path:
[Note: The POSIX format is the basis for the portable format because it is already an ISO standard, is the basis for the ubiquitous URL format, and is the native format or a subset of the native format for UNIX-like and Windows-like operating systems familiar to large numbers of programmers.
Use of the portable format does not alone guarantee portability; filenames must also be portable. See Filename conversions. Each operating system always follows its own rules. Use of the portable format does not change that. -- end note]
[Note: If an operating system supports only the POSIX pathname format, the portable format and the native format are the same.
Identifying user-provided paths as native format is a common need, and ensures maximum portability, even though not strictly needed except on systems where no duck-rule exists.
Programs using hard-coded native formats are likely to be non-portable. -- end note]
basic_path constructors with a path_format_t argument of native accept the native pathname format. Implementations may define additional path_format_t argument values and associated formats.
All other string or sequence arguments that describe a path accept the portable pathname format. Implementations are encouraged to also accept the native pathname format if it is possible to distinguish the two in cases where interpretation differs. An implementation shall document whether or not the native pathname format is also accepted.
[Example:
-- OpenVMS: "SYS1::DISK1:[JANE.TYLER.HARRY]" is treated as a native pathname with a system name, drive name, and three directory filenames, rather than a portable pathname with one filename.
-- Windows: "c:\\jane\\tyler\\harry" is treated as a native pathname with a drive letter, root-directory, and three filenames, rather than a portable pathname with one filename.
-- Counter-example 1: An operating system that allows slashes in filenames and uses dot as a directory separator. Distinguishing between portable and native format argument strings or sequences is not possible as there is no other distinguishing syntax. The implementation does not accept native format pathnames unless the native argument is present.
-- Counter-example 2: An operating system that allows slashes in filenames and uses some unusual character as a directory separator. The implementation does accept native format pathnames without the additional native argument, which only has to be used for native format arguments containing slashes in filenames.
-- end example]
[Note: This duck-rule ("if it looks like a duck, walks like a duck, and quacks like a duck, it must be a duck") eliminates format confusion as a source of programmer error and support requests. -- end note]
If both the portable and native formats are accepted, implementations shall document what characters or character sequences are used to distinguish between portable and native formats.
[Note: Windows implementations are encouraged to define colons and backslashes as the characters which distinguish native from portable formats. --end note]
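The duck-rule for a Windows-like implementation can be illustrated with a minimal sketch. The helper name looks_like_windows_native is hypothetical and is not part of this proposal; it only assumes the encouragement in the note above, namely that a colon or a backslash marks an argument as native format.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the proposal): reports whether a
// pathname string contains one of the characters that a Windows-like
// implementation might use to tell native format from portable format.
bool looks_like_windows_native(const std::string& s) {
    // A colon (drive letter) or a backslash (native separator) never
    // appears in the portable grammar's separators, so either one is
    // taken here as the sign of a native pathname.
    return s.find_first_of(":\\") != std::string::npos;
}
```

Under this assumption "c:\\jane\\tyler\\harry" is classified as native, while "jane/tyler/harry" is classified as portable. An actual implementation is free to document a different set of distinguishing characters.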
The grammar for the portable pathname format is as follows:
pathname:
    root-name_opt root-directory_opt relative-path_opt
root-name:
    implementation-defined
root-directory:
    slash
    root-directory slash
    implementation-defined
relative-path:
    filename
    relative-path slash
    relative-path slash filename
filename:
    name
    dot
    dot dot
slash:
    slash<Path>::value
dot:
    dot<Path>::value
The grammar is aligned with the POSIX Filename, Pathname and Pathname Resolution definitions. Any conflict between the grammar and POSIX is unintentional. This technical report defers to POSIX.
The form of the above wording was taken from POSIX, which uses it in several places to defer to the C standard.
[Note: Windows implementations are encouraged to define slash slash name as a permissible root-name. POSIX permits, but does not require, implementations to do the same. Windows implementations are encouraged to define an additional root-directory element root_directory name. It is applicable only to the slash slash name form of root-name.
Windows implementations are encouraged to recognize a name followed by a colon as a native format root-name, and a backslash as a format element equivalent to slash. -- end note]
When converting filenames to the native operating system format, implementations are encouraged, but not required, to convert otherwise invalid characters or character sequences to valid characters or character sequences. Such conversions are implementation-defined.
[Note: Filename conversion allows much wider portability of both programs and filenames than would otherwise be possible.
Implementations are encouraged to base conversion on existing standards or practice. Examples include the Uniform Resource Locator escape syntax of a percent sign ('%') followed by two hex digits representing the character value. On OpenVMS, which does not allow percent signs in filenames, a dollar sign ('$') followed by two hex digits is the existing practice, as is converting lowercase letters to uppercase. -- end note]
The Boost implementation for Windows currently does not map invalid characters. Pending feedback from the LWG, Boost may settle on % hex hex as the preferred escape sequence. If so, should there be normative encouragement?
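As an illustration of the URL-style escape mentioned in the note, the following sketch replaces each invalid character with a percent sign and two hex digits. The function name escape_filename and the caller-supplied set of invalid characters are assumptions made for the example; the proposal leaves the actual conversion implementation-defined.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the proposal): escapes every character
// of `name` that appears in `invalid` as '%' followed by two uppercase hex
// digits. '%' itself is also escaped so the mapping stays reversible.
std::string escape_filename(const std::string& name, const std::string& invalid) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : name) {
        const bool must_escape =
            c == '%' || invalid.find(static_cast<char>(c)) != std::string::npos;
        if (must_escape) {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

An OpenVMS-style variant would presumably use '$' as the escape character instead and fold letters to uppercase, following the existing practice described above.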
The argument for the template parameter named String shall be a class that includes members with the same names, types, values, and semantics as class template basic_string.
The argument for the template parameter named Traits shall be a class that satisfies the requirements specified in the Path Behavior Traits Requirements table.
The argument for template parameters named InputIterator shall satisfy the requirements of an input iterator (C++ Std, 24.1.1, Input iterators [lib.input.iterators]) and shall have a value type convertible to basic_path::value_type.
Some function templates with a template parameter named InputIterator also have non-template overloads. Implementations shall only select the function template overload if the type named by InputIterator is not path_format_t.
[Note: This "do-the-right-thing" rule ensures that the overload expected by the user is selected. The implementation technique is unspecified - implementations may use enable_if or other techniques to achieve the effect. -- end note]
basic_path constructors
basic_path();
Postconditions: empty().
basic_path(const string_type& s, path_format_t=portable);
basic_path(const value_type* s, path_format_t=portable);
template <class InputIterator> basic_path(InputIterator first, InputIterator last, path_format_t=portable);
Remarks: The format of string s and sequence [first,last) is described in Pathname formats.
Effects: The path elements in string s or sequence [first,last) are stored.
basic_path assignments
basic_path& operator=(const string_type& s);
basic_path& operator=(const value_type* s);
template <class InputIterator> basic_path& assign(InputIterator first, InputIterator last, path_format_t=portable);
Remarks: The format of string s and sequence [first,last) is described in Pathname formats.
Effects: The path elements in string s or sequence [first,last) are stored.
Returns: *this
basic_path comparisons
[Note: Path equality and path equivalence have different semantics.
Equality is determined by basic_path's operator==, which considers the two paths' lexical representations only. Paths "abc" and "ABC" are never equal.
Equivalence is determined by the equivalent() non-member function, which determines if two paths resolve to the same file system entity. Paths "abc" and "ABC" may or may not resolve to the same file, depending on the file system.
Programmers wishing to determine if two paths are "the same" must decide if "the same" means "the same representation" or "resolve to the same actual file", and choose the appropriate function accordingly. -- end note]
bool operator<(const basic_path& that) const;
Returns: std::lexicographical_compare(begin(), end(), that.begin(), that.end())
[Note: Relational operators ease specifying paths as keys in associative containers. Lexicographical comparison is specified because although not full-fledged containers, paths are enough like containers to merit meeting container comparison requirements (23.1 table 65). -- end note]
bool operator==(const basic_path& that) const;
Returns: !(*this < that) && !(that < *this)
bool operator!=(const basic_path& that) const;
Returns: !(*this == that)
bool operator>(const basic_path& that) const;
Returns: that < *this
bool operator<=(const basic_path& that) const;
Returns: !(that < *this)
bool operator>=(const basic_path& that) const;
Returns: !(*this < that)
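The way all six comparisons follow from operator< can be sketched with a stand-in type. The type toy_path below is hypothetical and models a path only as its sequence of elements; it is not the proposed basic_path.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a path: nothing but its sequence of elements.
struct toy_path {
    std::vector<std::string> elems;
};

// Element-wise lexicographical comparison, as specified for operator<.
bool operator<(const toy_path& a, const toy_path& b) {
    return std::lexicographical_compare(a.elems.begin(), a.elems.end(),
                                        b.elems.begin(), b.elems.end());
}

// The remaining operators are defined purely in terms of operator<.
bool operator==(const toy_path& a, const toy_path& b) { return !(a < b) && !(b < a); }
bool operator!=(const toy_path& a, const toy_path& b) { return !(a == b); }
bool operator>(const toy_path& a, const toy_path& b)  { return b < a; }
bool operator<=(const toy_path& a, const toy_path& b) { return !(b < a); }
bool operator>=(const toy_path& a, const toy_path& b) { return !(a < b); }
```

With these definitions a path holding "abc" and a path holding "ABC" compare unequal, which matches the note above: equality is purely lexical and never consults the file system.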
basic_path modifiers
basic_path& operator/=(const basic_path& rhs);
Effects: The path stored in rhs is appended to the stored path.
Returns: *this
basic_path& operator/=(const string_type& s);
basic_path& operator/=(const value_type* s);
template <class InputIterator> basic_path& append(InputIterator first, InputIterator last, path_format_t=portable);
Remarks: The format of string s and sequence [first,last) is described in Pathname formats.
Effects: The path elements in string s or sequence [first,last) are appended to the stored path.
Returns: *this
basic_path& remove_leaf();
Effects: If has_branch_path() then remove the last filename from the stored path. If that leaves the stored path with one or more trailing slash elements not representing root-directory, remove them.
Returns: *this
[Note: This function is needed to efficiently implement basic_directory_iterator. It is made public to allow additional uses. -- end note]
basic_path observers
See the Path decomposition table for examples of values returned by decomposition functions.
const string_type string() const;
Returns: The stored path, formatted according to the Pathname grammar rules.
const string_type file_string() const;
Returns: The stored path, formatted according to the operating system rules for regular file pathnames, with any Filename conversion applied.
[Note: For some operating systems, including POSIX and Windows, the native format for regular file pathnames and directory pathnames is the same, so file_string() and directory_string() return the same string. On OpenVMS, however, the expression path("/cats/jane").file_string() would return the string "[CATS]JANE" while path("/cats/jane").directory_string() would return the string "[CATS.JANE]". -- end note]
const string_type directory_string() const;
Returns: The stored path, formatted according to the operating system rules for directory pathnames, with any Filename conversion applied.
const external_string_type external_file_string() const;
Returns: The stored path, formatted according to the operating system rules for regular file pathnames, with any Filename conversion applied, and encoded by the Traits::to_external conversion function.
const external_string_type external_directory_string() const;
Returns: The stored path, formatted according to the operating system rules for directory pathnames, with any Filename conversion applied, and encoded by the Traits::to_external conversion function.
string_type root_name() const;
Returns: root-name, if the stored path includes root-name, otherwise string_type().
string_type root_directory() const;
Returns: root-directory, if the stored path includes root-directory, otherwise string_type(). If root-directory is composed of slash name, slash is excluded from the returned string.
basic_path root_path() const;
Returns: root_name() / root_directory()
basic_path relative_path() const;
Returns: A basic_path composed from the stored path, if any, beginning with the first filename after root-path. Otherwise, an empty basic_path.
string_type leaf() const;
Returns: empty() ? string_type() : *--end()
basic_path branch_path() const;
Returns: (string().empty() || begin() == --end()) ? path_type("") : br, where br is constructed as if by starting with an empty basic_path and successively applying operator/= for each element in the range begin(), --end().
bool empty() const;
Returns: string().empty().
bool is_complete() const;
Returns: true, if the elements of root_path() uniquely identify a directory, else false.
bool has_root_path() const;
Returns: !root_path().empty()
bool has_root_name() const;
Returns: !root_name().empty()
bool has_root_directory() const;
Returns: !root_directory().empty()
bool has_relative_path() const;
Returns: !relative_path().empty()
bool has_leaf() const;
Returns: !leaf().empty()
bool has_branch_path() const;
Returns: !branch_path().empty()
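The leaf() and branch_path() decomposition can be tried out with the C++17 std::filesystem library, where the corresponding observers are spelled filename() and parent_path(). The wrappers leaf_sketch and branch_path_sketch below are hypothetical names used only for this illustration, and std::filesystem is used as a stand-in, so results for edge cases such as trailing slashes may differ from what this proposal specifies.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical wrappers (not part of the proposal) over C++17
// std::filesystem, used as a stand-in for basic_path.

// Analog of leaf(): the last element of the path.
std::string leaf_sketch(const std::string& p) {
    return std::filesystem::path(p).filename().generic_string();
}

// Analog of branch_path(): everything before the last element.
std::string branch_path_sketch(const std::string& p) {
    return std::filesystem::path(p).parent_path().generic_string();
}
```

For the path "/cats/jane" used in the note above, the leaf is "jane" and the branch path is "/cats"; for a single filename such as "jane" the branch path is empty, so the has_branch_path() analog would be false.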
basic_path iterators
A basic_path::iterator is a constant iterator satisfying all the requirements of a bidirectional iterator (C++ Std, 24.1.4 Bidirectional iterators [lib.bidirectional.iterators]). Its value_type is string_type.
Calling any non-const member function of a basic_path object invalidates all iterators referring to elements of the object.
The forward traversal order is as follows:
The backward traversal order is the reverse of forward traversal.
iterator begin() const;
Returns: An iterator for the first present element in the traversal list above. If no elements are present, the end iterator.
iterator end() const;
Returns: The end iterator.
basic_path operators
basic_path operator /(const basic_path& rhs) const;
basic_path operator /(const string_type& s) const;
basic_path operator /(const value_type* s) const;
template <class InputIterator> basic_path concat(InputIterator first, InputIterator last, path_format_t=portable);
Remarks: The format of string s and sequence [first,last) is described in Pathname formats.
Returns: basic_path(*this) with rhs, s, or [first,last) appended, as if by operator/= or append.
basic_filesystem_error
namespace std { namespace tr2 { namespace sys {

  template <class Path>
  class basic_filesystem_error : public std::runtime_error
  {
  public:
    typedef Path path_type;

    explicit basic_filesystem_error(const std::string& msg, system_error_type ec=0);
    basic_filesystem_error(const std::string& msg, const path_type& p1, system_error_type ec);
    basic_filesystem_error(const std::string& msg, const path_type& p1, const path_type& p2, system_error_type ec);
    basic_filesystem_error(const basic_filesystem_error& bfe);
    basic_filesystem_error& operator=(const basic_filesystem_error& bfe);
    ~basic_filesystem_error();

    const std::string& message() const;
    system_error_type system_error() const;
    const path_type& path1() const;
    const path_type& path2() const;
  };

} // namespace sys
} // namespace tr2
} // namespace std

The class template basic_filesystem_error defines the type of objects thrown as exceptions to report file system errors from functions described in this clause.
basic_filesystem_error constructors
explicit basic_filesystem_error(const std::string& msg, system_error_type ec=0);
Postconditions:
    Expression          Value
    message()           Reference to stored copy of msg
    system_error()      ec
    path1().empty()     true
    path2().empty()     true
basic_filesystem_error(const std::string& msg, const path_type& p1, system_error_type ec);
Postconditions:
    Expression          Value
    message()           Reference to stored copy of msg
    system_error()      ec
    path1()             Reference to stored copy of p1
    path2().empty()     true
basic_filesystem_error(const std::string& msg, const path_type& p1, const path_type& p2, system_error_type ec);
Postconditions:
    Expression          Value
    message()           Reference to stored copy of msg
    system_error()      ec
    path1()             Reference to stored copy of p1
    path2()             Reference to stored copy of p2
basic_filesystem_error observers
const std::string& message() const;
Returns: Reference to copy of msg stored by the constructor, or, if none, an empty string.
system_error_type system_error() const;
Returns: The value of ec stored by the constructor.
const path_type& path1() const;
Returns: Reference to copy of p1 stored by the constructor, or, if none, an empty path.
const path_type& path2() const;
Returns: Reference to copy of p2 stored by the constructor, or, if none, an empty path.
basic_directory_entry
namespace std { namespace tr2 { namespace sys {

  template <class Path>
  class basic_directory_entry
  {
  public:
    typedef Path path_type;
    typedef typename Path::string_type string_type;

    // constructors
    basic_directory_entry();
    explicit basic_directory_entry(const path_type& p, status_flags sf=0, status_flags symlink_sf=0);

    // modifiers
    void assign(const path_type& p, status_flags sf=0, status_flags symlink_sf=0);
    void replace_leaf(const string_type& s, status_flags sf=0, status_flags symlink_sf=0);

    // observers
    const Path& path() const;
    operator const Path&() const;
    status_flags status(system_error_type* ec=0) const;
    status_flags status(const symlink_t&, system_error_type* ec=0) const;
    bool exists() const;
    bool is_directory() const;
    bool is_regular() const;
    bool is_other() const;
    bool is_symlink() const;

    // comparisons
    bool operator<(const basic_directory_entry<Path>& rhs);
    bool operator==(const basic_directory_entry<Path>& rhs);
    bool operator!=(const basic_directory_entry<Path>& rhs);
    bool operator>(const basic_directory_entry<Path>& rhs);
    bool operator<=(const basic_directory_entry<Path>& rhs);
    bool operator>=(const basic_directory_entry<Path>& rhs);

  private:
    path_type m_path;                       // for exposition only
    mutable status_flags m_status;          // for exposition only; stat()-like
    mutable status_flags m_symlink_status;  // for exposition only; lstat()-like
  };

} // namespace sys
} // namespace tr2
} // namespace std
A basic_directory_entry object stores a basic_path object, a status_flags object for non-symbolic link status, and a status_flags object for symbolic link status. The status_flags objects act as value caches.
[Note: Because status() may be a very expensive operation, caching of status flags can result in significant time savings. Cached and non-cached results may differ in the presence of race conditions. -- end note]
Actual cold-boot timing of iteration over a directory with 15,047 entries was six seconds for non-cached status queries versus one second for cached status queries. Windows XP, 3.0 GHz processor, with a moderately fast hard-drive. Similar speedup expected on Linux and BSD-derived Unix variants that provide status during directory iteration.
basic_directory_entry constructors
basic_directory_entry();
Postconditions:
    Expression          Value
    path().empty()      true
    status()            0
    status(symlink)     0
explicit basic_directory_entry(const path_type& p, status_flags sf=0, status_flags symlink_sf=0);
Postconditions:
    Expression          Value
    path()              p
    status()            sf
    status(symlink)     symlink_sf
basic_directory_entry modifiers
void assign(const path_type& p, status_flags sf=0, status_flags symlink_sf=0);
Postconditions:
    Expression          Value
    path()              p
    status()            sf
    status(symlink)     symlink_sf
void replace_leaf(const string_type& s, status_flags sf=0, status_flags symlink_sf=0);
Postconditions:
    Expression          Value
    path()              path().branch_path() / s
    status()            sf
    status(symlink)     symlink_sf
basic_directory_entry observers
const Path& path() const;
operator const Path&() const;
Returns: m_path
status_flags status(system_error_type* ec=0) const;
Effects: if m_status is zero, set m_status to sys::status(m_path, ec)
Returns: m_status
status_flags status(const symlink_t&, system_error_type* ec=0) const;
Effects: if m_symlink_status is zero, set m_symlink_status to sys::status(m_path, symlink, ec)
Returns: m_symlink_status
bool exists() const;
Returns: this->status() != not_found_flag
bool is_directory() const;
Returns: (this->status() & directory_flag) != 0
bool is_regular() const;
Returns: (this->status() & regular_flag) != 0
bool is_other() const;
Returns: (this->status() & other_flag) != 0
bool is_symlink() const;
Returns: (this->status(symlink) & symlink_flag) != 0
basic_directory_entry comparisons
bool operator<(const basic_directory_entry<Path>& rhs);
Returns: path() < rhs.path()
bool operator==(const basic_directory_entry<Path>& rhs);
Returns: path() == rhs.path()
bool operator!=(const basic_directory_entry<Path>& rhs);
Returns: path() != rhs.path()
bool operator>(const basic_directory_entry<Path>& rhs);
Returns: path() > rhs.path()
bool operator<=(const basic_directory_entry<Path>& rhs);
Returns: path() <= rhs.path()
bool operator>=(const basic_directory_entry<Path>& rhs);
Returns: path() >= rhs.path()
basic_directory_iterator
namespace std { namespace tr2 { namespace sys {

  template <class Path>
  class basic_directory_iterator :
    public iterator<input_iterator_tag, basic_directory_entry<Path> >
  {
  public:
    typedef Path path_type;

    // constructors
    basic_directory_iterator();
    explicit basic_directory_iterator(const Path& dp);
    basic_directory_iterator(const basic_directory_iterator& bdi);
    basic_directory_iterator& operator=(const basic_directory_iterator& bdi);
    ~basic_directory_iterator();

    // other members as required by
    // C++ Std, 24.1.1 Input iterators [lib.input.iterators]
  };

} // namespace sys
} // namespace tr2
} // namespace std
basic_directory_iterator satisfies the requirements of an input iterator (C++ Std, 24.1.1, Input iterators [lib.input.iterators]).
A basic_directory_iterator reads successive elements from the directory for which it was constructed, as if by calling POSIX readdir_r(). After a basic_directory_iterator is constructed, and every time operator++ is called, it reads and stores a value of basic_directory_entry<Path> and possibly stores associated status values. operator++ is not equality preserving; that is, i == j does not imply that ++i == ++j.
[Note: The practical consequence of not preserving equality is that directory iterators can be used only for single-pass algorithms. --end note]
If the end of the directory elements is reached, the iterator becomes equal to the end iterator value. The constructor basic_directory_iterator() with no arguments always constructs an end iterator object, which is the only legitimate iterator to be used for the end condition. The result of operator* on an end iterator is not defined. For any other iterator value a const basic_directory_entry<Path>& is returned. The result of operator-> on an end iterator is not defined. For any other iterator value a const basic_directory_entry<Path>* is returned.
Two end iterators are always equal. An end iterator is not equal to a non-end iterator.
The above wording is based on the Standard Library's istream_iterator wording. Commentary was shortened and moved into a note.
The result of calling the path() member of the basic_directory_entry object obtained by dereferencing a basic_directory_iterator is a reference to a basic_path object composed of the directory argument from which the iterator was constructed with the filename of the directory entry appended as if by operator/=.
[Example: This program accepts an optional command line argument, and if that argument is a directory pathname, iterates over the contents of the directory. For each directory entry, the name is output, and if the entry is for a regular file, the size of the file is output.

#include <iostream>
#include <filesystem>

using namespace std::tr2::sys;
using std::cout;

int main(int argc, char* argv[])
{
  // default to the current directory when no argument is given
  path p(argc <= 1 ? "." : argv[1]);

  if (is_directory(p))
  {
    for (directory_iterator itr(p); itr != directory_iterator(); ++itr)
    {
      cout << itr->path().leaf() << ' ';  // display filename only
      if (itr->is_regular())              // size is reported for regular files only
        cout << " [" << file_size(itr->path()) << ']';
      cout << '\n';
    }
  }
  else
    cout << (exists(p) ? "Found: " : "Not found: ") << p.string() << '\n';

  return 0;
}

-- end example]
Directory iteration shall not yield directory entries for the current (dot) and parent (dot dot) directories.
The order of directory entries obtained by dereferencing successive increments of a basic_directory_iterator is unspecified.
[Note: Programs performing directory iteration may wish to test if the path obtained by dereferencing a directory iterator actually exists. It could be a symbolic link to a non-existent file. Programs recursively walking directory trees for purposes of removing and renaming entries may wish to avoid following symbolic links.
If a file is removed from or added to a directory after the construction of a basic_directory_iterator for the directory, it is unspecified whether or not subsequent incrementing of the iterator will ever result in an iterator whose value is the removed or added directory entry. See POSIX readdir_r(). --end note]
basic_directory_iterator constructors
basic_directory_iterator();
Effects: Constructs the end iterator.
explicit basic_directory_iterator(const Path& dp);
Effects: Constructs an iterator with a value representing the first entry in the directory resolved to by dp, or, if the directory is empty, the end iterator value.
[Note: To iterate over the current directory, write directory_iterator(".") rather than directory_iterator(""). -- end note]
basic_recursive_directory_iterator
namespace std { namespace tr2 { namespace sys {

  template <class Path>
  class basic_recursive_directory_iterator :
    public iterator<input_iterator_tag, basic_directory_entry<Path> >
  {
  public:
    typedef Path path_type;

    // constructors
    basic_recursive_directory_iterator();
    explicit basic_recursive_directory_iterator(const Path& dp);
    basic_recursive_directory_iterator(const basic_recursive_directory_iterator& brdi);
    basic_recursive_directory_iterator& operator=(const basic_recursive_directory_iterator& brdi);
    ~basic_recursive_directory_iterator();

    // observers
    int level() const;

    // modifiers
    void pop();
    void no_push();

    // other members as required by
    // C++ Std, 24.1.1 Input iterators [lib.input.iterators]

  private:
    int m_level;  // for exposition only
  };

} // namespace sys
} // namespace tr2
} // namespace std
The behavior of a basic_recursive_directory_iterator is the same as a basic_directory_iterator unless otherwise specified.
- When an iterator is constructed, m_level is set to 0.
- When an iterator it is incremented, if it->is_directory() is true and no_push() had not been called subsequent to the most recent increment operation (or construction, if no increment has occurred), then m_level is incremented, the directory is visited, and its contents recursively iterated over.
- When pop() is called, m_level is decremented, and iteration continues with the parent directory, until the directory specified in the constructor argument is reached.
- level() returns m_level.
- level(), pop(), and no_push() all require that the iterator not be the end iterator.
[Note: One of the uses of no_push() is to prevent unwanted recursion into symlinked directories. This may be necessary to prevent loops on some operating systems. -- end note]
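The use of no_push() described in the note can be sketched with the C++17 std::filesystem library, used here as a stand-in because it offers close analogs: disable_recursion_pending() plays the role of no_push(), and depth() plays the role of level(). The function name walk_without_symlinked_dirs is hypothetical. The follow_directory_symlink option is passed only so that the iterator would otherwise descend into symlinked directories, which makes the effect of the opt-out visible.

```cpp
#include <cassert>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Collects every entry under `root`, but never descends into a directory
// that was reached through a symbolic link.
std::vector<fs::path> walk_without_symlinked_dirs(const fs::path& root) {
    std::vector<fs::path> seen;
    // With this option the iterator recurses into symlinked directories
    // unless told otherwise for the current entry.
    fs::recursive_directory_iterator it(
        root, fs::directory_options::follow_directory_symlink);
    const fs::recursive_directory_iterator end;
    for (; it != end; ++it) {
        seen.push_back(it->path());
        // is_symlink() inspects the link itself; is_directory() follows it.
        if (it->is_symlink() && it->is_directory()) {
            it.disable_recursion_pending();  // analog of the proposal's no_push()
        }
    }
    return seen;
}
```

The symlinked directory itself is still reported; only its contents are skipped, which is enough to break a cycle formed by a link that points back to an ancestor directory.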
template <class Path> status_flags status(const Path& p, system_error_type* ec=0);
template <class Path> status_flags status(const Path& p, const symlink_t&, system_error_type* ec=0);
Returns:
If p.empty():
- If ec != 0, set *ec to the operating system error code equivalent to POSIX ENOENT.
- Return not_found_flag.
If the symlink_t argument is not present, determine the attributes of p as if by POSIX stat(), else determine the attributes as if by POSIX lstat().
[Note: For symbolic links, stat() continues pathname resolution using the contents of the symbolic link, lstat() does not. -- end note]
If the attribute determination reports an error:
- If ec != 0, set *ec to the error code reported by the operating system.
- If the operating system reports an error indicating that p could not be resolved, as if by POSIX error codes ENOENT or ENOTDIR, return not_found_flag, else return error_flag.
Otherwise:
- Set a status_flags temporary to 0.
- If the attributes indicate a symbolic link, as if by POSIX S_ISLNK(), return symlink_flag.
- If the attributes indicate a directory, as if by POSIX S_ISDIR(), set the temporary to directory_flag.
- If the attributes indicate a regular file, as if by POSIX S_ISREG(), bitwise-or regular_flag into the temporary.
- If the temporary is 0, set it to other_flag.
- Return the temporary.
[Note: directory_flag implies basic_directory_iterator on the file would succeed, and regular_flag implies appropriate <fstream> operations would succeed, assuming no hardware, permission, or access errors and no race conditions. For regular_flag, the converse is not true; lack of regular_flag does not necessarily imply <fstream> operations would fail on a directory or other file. -- end note]
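The steps above can be sketched for a POSIX system as follows. This is only an illustration: the function name status_sketch, the bool parameter that stands in for the symlink_t overload, the use of a C string instead of a Path, and the numeric flag values are all assumptions made for the example and are not specified by this proposal.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/stat.h>
#include <sys/types.h>

// Flag values are assumptions for this sketch only.
enum {
    not_found_flag = 1, error_flag = 2, directory_flag = 4,
    regular_flag = 8, other_flag = 16, symlink_flag = 32
};

// Hypothetical POSIX-only sketch of the status() algorithm.
// `use_lstat` stands in for the presence of the symlink_t argument.
unsigned status_sketch(const char* p, bool use_lstat, int* ec = nullptr) {
    if (p == nullptr || *p == '\0') {   // corresponds to p.empty()
        if (ec) *ec = ENOENT;
        return not_found_flag;
    }
    struct stat st;
    const int rc = use_lstat ? ::lstat(p, &st) : ::stat(p, &st);
    if (rc != 0) {                      // attribute determination failed
        const int err = errno;
        if (ec) *ec = err;
        return (err == ENOENT || err == ENOTDIR) ? not_found_flag : error_flag;
    }
    if (S_ISLNK(st.st_mode)) return symlink_flag;  // only possible via lstat()
    unsigned flags = 0;
    if (S_ISDIR(st.st_mode)) flags = directory_flag;
    if (S_ISREG(st.st_mode)) flags |= regular_flag;
    if (flags == 0) flags = other_flag;
    return flags;
}
```

Called on ".", the sketch yields directory_flag; called on a pathname whose leading component does not exist, it yields not_found_flag and reports ENOENT through ec, so a missing file is not treated as an error.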
template <class Path> bool exists(const Path& p);
Effects: Determines status_flags sf, as if by status(p).
Throws: basic_filesystem_error<Path> if sf == error_flag.
Returns: sf != not_found_flag
template <class Path> bool is_directory(const Path& p);
Effects: Determines status_flags sf, as if by status(p).
Throws: basic_filesystem_error<Path> if sf == error_flag.
Returns: (sf & directory_flag) != 0
template <class Path> bool is_regular(const Path& p);
Effects: Determines status_flags sf, as if by status(p).
Throws: basic_filesystem_error<Path> if sf == error_flag.
Returns: (sf & regular_flag) != 0
template <class Path> bool is_other(const Path& p);
Effects: Determines status_flags sf, as if by status(p).
Throws: basic_filesystem_error<Path> if sf == error_flag.
Returns: (sf & other_flag) != 0
template <class Path> bool is_symlink(const Path& p);
Effects: Determines status_flags sf, as if by status(p, symlink).
Throws: basic_filesystem_error<Path> if sf == error_flag.
Returns: (sf & symlink_flag) != 0
template <class Path> bool empty(const Path& p);
Effects: Determines status_flags sf, as if by status(p).
Throws: basic_filesystem_error<Path> if sf == error_flag || sf == not_found_flag || sf == other_flag.
Returns: (sf & directory_flag) != 0 ? basic_directory_iterator<Path>(p) == basic_directory_iterator<Path>() : file_size(p) == 0;
template <class Path1, class Path2> bool equivalent(const Path1& p1, const Path2& p2);
Requires: Path1::external_string_type and Path2::external_string_type are the same type.
Effects: Determines status_flags sf1 and sf2, as if by status(p1) and status(p2), respectively. Then, throws if sf1 == error_flag || sf2 == error_flag || (sf1 == not_found_flag && sf2 == not_found_flag) || (sf1 == other_flag && sf2 == other_flag).
Throws: basic_filesystem_error<Path1>
Returns: true, if sf1 == sf2 and p1 and p2 resolve to the same file system entity, else false.
Two paths are considered to resolve to the same file system entity if two candidate entities reside on the same device at the same location. This is determined as if by the values of the POSIX stat structure, obtained as if by stat() for the two paths, having equal st_dev values and equal st_ino values.
[Note: POSIX requires that "st_dev must be unique within a Local Area Network". Conservative POSIX implementations may also wish to check for equal st_size and st_mtime values. Windows implementations may use GetFileInformationByHandle() as a surrogate for stat(), and consider "same" to be equal values for dwVolumeSerialNumber, nFileIndexHigh, nFileIndexLow, nFileSizeHigh, nFileSizeLow, ftLastWriteTime.dwLowDateTime, and ftLastWriteTime.dwHighDateTime. -- end note]
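The device and inode comparison can be sketched on a POSIX system as shown below. The function name same_entity is hypothetical, and the sketch deliberately simplifies error handling: where equivalent() is specified to throw, it returns false whenever either pathname cannot be resolved.

```cpp
#include <cassert>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical POSIX-only sketch: two pathnames name the same file system
// entity when stat() reports the same device and the same inode for both.
bool same_entity(const char* p1, const char* p2) {
    struct stat a;
    struct stat b;
    // Simplification: an unresolvable pathname yields false instead of
    // the exception required of equivalent().
    if (::stat(p1, &a) != 0 || ::stat(p2, &b) != 0) {
        return false;
    }
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

The pathnames "." and "./." differ lexically, so they would not compare equal with operator==, yet they resolve to the same directory and the sketch reports them as the same entity.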
[Note: A strictly limited number of attribute functions are provided because few file system attributes are even somewhat portable. Even the functions provided will be impossible to implement on some file systems. --end note.]
template <class Path> const Path& initial_path();
Returns: current_path() at the time of entry to main().
[Note: These semantics turn a dangerous global variable into a safer global constant. --end note]
[Note: Full implementation requires runtime library support. Implementations which cannot provide runtime library support are encouraged to instead store the value of current_path() at the first call of initial_path(), and return this value for all subsequent calls. Programs using initial_path() are encouraged to call it immediately on entrance to main() so that they will work correctly with such partial implementations. --end note]
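The fallback strategy described in the second note, storing the current path at the first call, might look like the following sketch. It uses C++17 std::filesystem as a stand-in for the proposed library, and the name initial_path_sketch is hypothetical.

```cpp
#include <cassert>
#include <filesystem>

// Hypothetical sketch of the partial implementation: the current path is
// captured once, at the first call, and returned unchanged afterwards.
const std::filesystem::path& initial_path_sketch() {
    // A function-local static is initialized exactly once, on first use.
    static const std::filesystem::path init = std::filesystem::current_path();
    return init;
}
```

Because the value is captured at the first call rather than at entry to main(), a program relying on this sketch would call initial_path_sketch() at the very start of main(), before anything has a chance to change the current directory.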
template <class Path> Path current_path();
Returns: The current path, as if by POSIX getcwd().
Postcondition: current_path().is_complete()
[Note: The current path as returned by many operating systems is a dangerous global variable. It may be changed unexpectedly by third-party or system library functions, or by another thread. Although dangerous, the function is useful in dealing with other libraries. For a safer alternative, see initial_path(). The current_path() name was chosen to emphasize that the return is a complete path, not just a single directory name. -- end note]
template <class Path> intmax_t file_size(const Path& p);
Returns: The size in bytes of the file p resolves to, determined as if by the value of the POSIX stat structure member st_size obtained as if by POSIX stat().
template <class Path> std::time_t last_write_time(const Path& p);
Returns: The time of last data modification of p, determined as if by the value of the POSIX stat structure member st_mtime obtained as if by POSIX stat().
template <class Path> void last_write_time(const Path& p, const std::time_t new_time);
Effects: Sets the time of last data modification of the file resolved to by p to new_time, as if by POSIX stat() followed by POSIX utime().
[Note: The apparent postcondition last_write_time(p) == new_time is not specified since it would not hold for many file systems due to coarse time mechanism granularity. -- end note]
template <class Path> bool create_directory(const Path& dp);
Requires: !dp.empty()
Effects: Attempts to create the directory dp resolves to, as if by POSIX mkdir() with a second argument of S_IRWXU|S_IRWXG|S_IRWXO.
Throws: basic_filesystem_error<Path> if Effects fails for any reason other than because the directory already exists.
Returns: True if a new directory was created, otherwise false.
Postcondition: is_directory(dp)
template <class Path1, class Path2> void create_hard_link(const Path1& old_p, const Path2& new_p);
Requires: Path1::external_string_type and Path2::external_string_type are the same type. !old_p.empty() && !new_p.empty()
Effects: Establishes the postcondition, as if by POSIX link().
Postcondition:
- exists(old_p) && exists(new_p) && equivalent(old_p, new_p)
- The contents of the file or directory old_p resolves to are unchanged.
[Note: Many operating systems do not support hard links or support them only for regular files. Some operating systems limit the number of links per file to a fairly small value - 1023 on Windows NTFS, for example. Operating systems cannot support hard links on file systems that do not support them - the FAT system used on floppy discs, memory cards and flash drives, is a common example. Thus hard links should be avoided if wide portability is a concern. -- end note]
template <class Path> bool remove(const Path& p);
Precondition: !p.empty()
Effects: Attempts to delete the file p resolves to, as if by POSIX remove().
Returns: The value of exists(p) prior to the establishment of the postcondition.
Postcondition: !exists(p)
Throws: basic_filesystem_error<Path> if:
- p.empty() || (exists(p) && is_directory(p) && !empty(p)).
- Effects fails for any reason other than because p does not resolve to an existing file.
[Note: A symbolic link is itself removed, rather than what it resolves to being removed. -- end note]
template <class Path1, class Path2> void rename(const Path1& from_p, const Path2& to_p);
Requires:
- Path1::external_string_type and Path2::external_string_type are the same type.
- !from_p.empty() && !to_p.empty()
Effects: Renames from_p to to_p, as if by POSIX rename().
Postconditions: !exists(from_p) && exists(to_p), and the contents and attributes of the file originally named from_p are otherwise unchanged.
[Note: If from_p and to_p resolve to the same file, no action is taken. Otherwise, if to_p resolves to an existing file, it is removed. A symbolic link is itself renamed, rather than the file it resolves to being renamed. -- end note]
template <class Path1, class Path2> void copy_file(const Path1& from_fp, const Path2& to_fp);
Requires:
- Path1::external_string_type and Path2::external_string_type are the same type.
- !from_fp.empty() && !to_fp.empty()
Effects: The contents and attributes of the file from_fp resolves to are copied to the file to_fp resolves to.
Throws: basic_filesystem_error<Path> if from_fp.empty() || to_fp.empty() || !exists(from_fp) || !is_regular(from_fp) || exists(to_fp)
template <class Path> Path complete(const Path& p, const Path& base=initial_path<Path>());
Requires: base.is_complete() && (p.is_complete() || !p.has_root_name())
Effects: Composes a complete path from p and base, using the following rules:
 | p.has_root_directory() | !p.has_root_directory() |
p.has_root_name() | p | precondition failure |
!p.has_root_name() | base.root_name() / p | base / p |
Returns: The composed path.
Postcondition: For the returned path, rp, rp.is_complete() is true.
Throws: On precondition failure (see clause introduction).
[Note: When portable behavior is required, use complete(). When operating system dependent behavior is required, use system_complete().
Portable behavior is useful when dealing with paths created internally within a program, particularly if the program should exhibit the same behavior on all operating systems.
Operating system dependent behavior is useful when dealing with paths supplied by user input, reported to program users, or when such behavior is expected by program users. -- end note]
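The four-way rule used by complete() can be illustrated with a minimal sketch. It assumes a hypothetical simple_path struct that stores an already decomposed pathname; this is not the proposal's basic_path, and the names simple_path and complete_sketch are invented for illustration. The sketch does not check base.is_complete(), and it reports the precondition failure with std::invalid_argument purely as a stand-in.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a decomposed pathname (not basic_path).
struct simple_path {
    std::string root_name;       // e.g. "c:", empty if absent
    std::string root_directory;  // "/" or empty
    std::string relative_path;   // everything after the root

    std::string str() const { return root_name + root_directory + relative_path; }
};

// Sketch of the rule table for complete(p, base).
simple_path complete_sketch(const simple_path& p, const simple_path& base) {
    const bool has_name = !p.root_name.empty();
    const bool has_dir = !p.root_directory.empty();

    if (has_name && has_dir) return p;           // result: p
    if (has_name)                                // root name without root directory
        throw std::invalid_argument("precondition failure");
    if (has_dir) {                               // result: base.root_name() / p
        simple_path r = p;
        r.root_name = base.root_name;
        return r;
    }
    simple_path r = base;                        // result: base / p
    if (!r.relative_path.empty() && r.relative_path.back() != '/')
        r.relative_path += '/';
    r.relative_path += p.relative_path;
    return r;
}
```

With base equal to {"c:", "/", "work"}, the relative pathname {"", "", "foo/bar"} composes to c:/work/foo/bar, while the rooted pathname {"", "/", "foo"} composes to c:/foo.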
template <class Path> Path system_complete(const Path& p);
Requires: !p.empty()
Effects: Composes a complete path from p, using the same rules used by the operating system to resolve a path passed as the filename argument to standard library open functions.
Returns: The composed path.
Postcondition: For the returned path, rp, rp.is_complete() is true.
Throws: On precondition failure (see clause introduction).
[Note: For POSIX, system_complete(p) has the same semantics as complete(p, current_path()).
For Windows, system_complete(p) has the same semantics as complete(p, current_path()) if p.is_complete() || !p.has_root_name() or p and base have the same root_name(). Otherwise it acts like complete(p, kinky), where kinky is the current directory for the p.root_name() drive. This will be the current directory of that drive the last time it was set, and thus may be residue left over from a prior program run by the command processor! Although these semantics are often useful, they are also very error-prone.
See complete() note for usage suggestions. -- end note]
errno_type to_errno( system_error_type code );
Returns: The value of the errno error number which corresponds to the operating system's error code code. The exact correspondence is implementation defined. Implementations are only required to support error codes reported by basic_filesystem_error exceptions thrown by functions defined in this clause.
void system_message( system_error_type ec, std::string & target );
void system_message( system_error_type ec, std::wstring & target );
Effects: Appends a message corresponding to ec to target.
[Note: Implementations are encouraged to supply a localized message. -- end note]
template <class Path> bool create_directories(const Path & p);
Requires: p.empty() || forall px: px == p || is_parent(px, p): is_directory(px) || !exists(px)
Returns: The value of !exists(p) prior to the establishment of the postcondition.
Postcondition: is_directory(p)
Throws: basic_filesystem_error<Path> if exists(p) && !is_directory(p)
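The Requires clause quantifies over p and each of its parents. A minimal sketch of that walk follows, run against a hypothetical in-memory fake_fs rather than a real file system. The names fake_fs, self_and_parents, and create_directories_sketch are invented for illustration, pathnames are assumed to be slash-separated with no repeated or trailing slashes, and std::runtime_error stands in for basic_filesystem_error. Unlike the specification, the sketch also throws when a parent violates the Requires clause.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory file system: names of directories and regular files.
struct fake_fs {
    std::set<std::string> dirs;
    std::set<std::string> files;
    bool exists(const std::string& p) const {
        return dirs.count(p) != 0 || files.count(p) != 0;
    }
    bool is_directory(const std::string& p) const { return dirs.count(p) != 0; }
};

// Lists each parent of p followed by p itself, shortest first.
std::vector<std::string> self_and_parents(const std::string& p) {
    std::vector<std::string> result;
    for (std::string::size_type i = 1; i < p.size(); ++i)
        if (p[i] == '/') result.push_back(p.substr(0, i));
    if (!p.empty()) result.push_back(p);
    return result;
}

// Returns !exists(p) as observed before any directory is created.
bool create_directories_sketch(fake_fs& fs, const std::string& p) {
    const bool was_missing = !fs.exists(p);
    const std::vector<std::string> chain = self_and_parents(p);
    for (std::size_t i = 0; i < chain.size(); ++i) {
        // Each px must be a directory or must not exist.
        if (fs.exists(chain[i]) && !fs.is_directory(chain[i]))
            throw std::runtime_error("not a directory: " + chain[i]);
        fs.dirs.insert(chain[i]);  // no effect if already present
    }
    return was_missing;
}
```

Calling the sketch twice with the same pathname returns true the first time and false the second, matching the Returns clause.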
template <class Path> typename Path::string_type extension(const Path & p);
Returns: if p.leaf() contains a dot, returns the substring of p.leaf() starting at the rightmost dot and ending at the string's end. Otherwise, returns an empty string.
[Note: The dot is included in the return value so that it is possible to distinguish between no extension and an empty extension.
Implementations are permitted but not required to define additional behavior for file systems which append additional elements to extensions, such as alternate data stream or partitioned dataset names. -- end note]
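The Returns clause for extension() can be sketched directly on the leaf string. The function name extension_sketch is hypothetical, and std::string is assumed in place of Path::string_type.

```cpp
#include <string>

// Returns the substring of leaf from the rightmost dot to the end,
// or an empty string if leaf contains no dot.
std::string extension_sketch(const std::string& leaf) {
    const std::string::size_type dot = leaf.rfind('.');
    return dot == std::string::npos ? std::string() : leaf.substr(dot);
}
```

For the leaf "readme" the result is empty (no extension), while for "readme." the result is "." (an empty extension), which is the distinction the note describes.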
template <class Path> typename Path::string_type basename(const Path & p);
Returns: if p.leaf() contains a dot, returns the substring of p.leaf() starting at its beginning and ending at the last dot (the dot is not included). Otherwise, returns p.leaf().
template <class Path> Path replace_extension(const Path & p, const typename Path::string_type & new_extension);
Postcondition: basename(return_value) == basename(p) && extension(return_value) == new_extension
[Note: It follows from the semantics of extension() that new_extension should include a dot to achieve reasonable results. -- end note]
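A minimal sketch shows why new_extension should carry its own dot. It works on the leaf string only, assumes std::string in place of Path::string_type, and uses hypothetical function names ending in _sketch; the proposal's functions take and return Path objects.

```cpp
#include <string>

// Substring of leaf from the rightmost dot to the end; empty if no dot.
std::string extension_sketch(const std::string& leaf) {
    const std::string::size_type dot = leaf.rfind('.');
    return dot == std::string::npos ? std::string() : leaf.substr(dot);
}

// Substring of leaf before the rightmost dot; the whole leaf if no dot.
std::string basename_sketch(const std::string& leaf) {
    // substr(0, npos) yields the whole string when no dot is found.
    return leaf.substr(0, leaf.rfind('.'));
}

// Keeps the basename and appends new_extension unchanged.
std::string replace_extension_sketch(const std::string& leaf,
                                     const std::string& new_extension) {
    return basename_sketch(leaf) + new_extension;
}
```

Replacing the extension of "notes.txt" with ".md" gives "notes.md", and both halves of the postcondition hold. Passing "md" without the dot gives "notesmd", whose extension is empty, so the postcondition cannot be met.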
Additions to header <cerrno>
The header <cerrno> shall include an additional symbolic constant macro for each of the values returned by the to_errno function. The macro names shall be as defined in POSIX errno.h, with the additions below.
This codifies existing practice. The required names are only a sub-set of those defined by POSIX, and are usually already supplied in <errno.h> (as wrapped by <cerrno>) as shipped with POSIX and Windows compilers. These implementations require no changes to their underlying C headers to conform with the above requirement.
Name | Meaning |
EBADHANDLE | Bad operating system handle. |
EOTHER | Other error. |
Additions to header <fstream>
These additions have been carefully specified to avoid breaking existing code in common operating environments such as POSIX, Windows, and OpenVMS. See Suggestions for <fstream> implementations for techniques to avoid breaking existing code in other environments, particularly on operating systems allowing slashes in filenames.
[Note: The "do-the-right-thing" rule from Requirements on implementations does apply to header <fstream>.
The overloads below are specified as additions rather than replacements for existing functions. This preserves existing code (perhaps using a home-grown path class) that relies on an automatic conversion to const char*. -- end note]
In 27.8.1.1 Class template basic_filebuf [lib.filebuf] synopsis preceding paragraph 1, add the function:
template <class Path> basic_filebuf<charT,traits>* open(const Path& p, ios_base::openmode mode);
In 27.8.1.3 Member functions [lib.filebuf.members], add the above to the signature preceding paragraph 2, and replace the sentence:
It then opens a file, if possible, whose name is the NTBS s (“as if” by calling std::fopen(s, modstr)).
with:
It then opens, if possible, the file that p or path(s) resolves to, “as if” by calling std::fopen() with a second argument of modstr.
In 27.8.1.5 Class template basic_ifstream [lib.ifstream] synopsis preceding paragraph 1, add the functions:
template <class Path> explicit basic_ifstream(const Path& p, ios_base::openmode mode = ios_base::in);
template <class Path> void open(const Path& p, ios_base::openmode mode = ios_base::in);
In 27.8.1.6 basic_ifstream constructors [lib.ifstream.cons] add the above constructor to the signature preceding paragraph 2, and in paragraph 2 replace
rdbuf()->open(s, mode | ios_base::in)
with
rdbuf()->open(path(s), mode | ios_base::in) or rdbuf()->open(p, mode | ios_base::in) as appropriate.
In 27.8.1.7 Member functions [lib.ifstream.members] add the above open function to the signature preceding paragraph 3, and in paragraph 3 replace
rdbuf()->open(s, mode | ios_base::in)
with
rdbuf()->open(path(s), mode | ios_base::in) or rdbuf()->open(p, mode | ios_base::in) as appropriate.
In 27.8.1.8 Class template basic_ofstream [lib.ofstream] synopsis preceding paragraph 1, add the functions:
template <class Path> explicit basic_ofstream(const Path& p, ios_base::openmode mode = ios_base::out);
template <class Path> void open(const Path& p, ios_base::openmode mode = ios_base::out);
In 27.8.1.9 basic_ofstream constructors [lib.ofstream.cons] add the above constructor to the signature preceding paragraph 2, and in paragraph 2 replace
rdbuf()->open(s, mode | ios_base::out)
with
rdbuf()->open(path(s), mode | ios_base::out) or rdbuf()->open(p, mode | ios_base::out) as appropriate.
In 27.8.1.10 Member functions [lib.ofstream.members] add the above open function to the signature preceding paragraph 3, and in paragraph 3 replace
rdbuf()->open(s, mode | ios_base::out)
with
rdbuf()->open(path(s), mode | ios_base::out) or rdbuf()->open(p, mode | ios_base::out) as appropriate.
In 27.8.1.11 Class template basic_fstream [lib.fstream] synopsis preceding paragraph 1, add the functions:
template <class Path> explicit basic_fstream(const Path& p, ios_base::openmode mode = ios_base::in|ios_base::out);
template <class Path> void open(const Path& p, ios_base::openmode mode = ios_base::in|ios_base::out);
In 27.8.1.12 basic_fstream constructors [lib.fstream.cons] add the above constructor to the signature preceding paragraph 2, and in paragraph 2 replace
rdbuf()->open(s, mode)
with
rdbuf()->open(path(s), mode) or rdbuf()->open(p, mode) as appropriate.
In 27.8.1.13 Member functions [lib.fstream.members] add the above open function to the signature preceding paragraph 3, and in paragraph 3 replace
rdbuf()->open(s, mode)
with
rdbuf()->open(path(s), mode) or rdbuf()->open(p, mode) as appropriate.
End of proposed text.
The table is generated by a program compiled with the Boost implementation.
Shaded entries indicate cases where POSIX and Windows implementations yield different results. The top value is the POSIX result and the bottom value is the Windows result.
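To illustrate how the "Elements found by iteration" column in the table below comes about, here is a minimal sketch that reproduces the values listed in that column. It is not the Boost algorithm: the name elements_sketch is hypothetical, only the slash is treated as a separator, root names such as "//net" are not handled, and an empty pathname is assumed to yield no elements.

```cpp
#include <string>
#include <vector>

// Splits a pathname into the elements an iteration would visit:
// a leading run of slashes becomes "/", interior runs of slashes are
// separators, and a trailing run of slashes after a name becomes ".".
std::vector<std::string> elements_sketch(const std::string& s) {
    std::vector<std::string> out;
    const std::string::size_type n = s.size();
    std::string::size_type i = 0;

    if (n > 0 && s[0] == '/') {
        out.push_back("/");                      // root directory
        while (i < n && s[i] == '/') ++i;
    }
    while (i < n) {
        std::string::size_type j = s.find('/', i);
        if (j == std::string::npos) j = n;
        out.push_back(s.substr(i, j - i));       // one name
        i = j;
        if (i < n) {
            while (i < n && s[i] == '/') ++i;    // skip the separator run
            if (i == n) out.push_back(".");      // trailing separator
        }
    }
    return out;
}
```

For "///foo///" the sketch yields "/", "foo", "." and for "c:/" it yields "c:", ".", in line with the corresponding rows.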
Constructor argument | Elements found by iteration | string() | file_string() | root_path() | root_name() | root_directory() | relative_path() | branch_path() | leaf() |
"" | "" | "" | "" | "" | "" | "" | "" | "" | "" |
"." | "." | "." | "." | "" | "" | "" | "." | "" | "." |
".." | ".." | ".." | ".." | "" | "" | "" | ".." | "" | ".." |
"foo" | "foo" | "foo" | "foo" | "" | "" | "" | "foo" | "" | "foo" |
"/" | "/" | "/" | "/" | "/" | "" | "/" | "" | "" | "/" |
"/foo" | "/","foo" | "/foo" | "/foo" | "/" | "" | "/" | "foo" | "/" | "foo" |
"foo/" | "foo","." | "foo/" | "foo/" | "" | "" | "" | "foo/" | "foo" | "." |
"/foo/" | "/","foo","." | "/foo/" | "/foo/" | "/" | "" | "/" | "foo/" | "/foo" | "." |
"foo/bar" | "foo","bar" | "foo/bar" | "foo/bar" | "" | "" | "" | "foo/bar" | "foo" | "bar" |
"/foo/bar" | "/","foo","bar" | "/foo/bar" | "/foo/bar" | "/" | "" | "/" | "foo/bar" | "/foo" | "bar" |
"///foo///" | "/","foo","." | "///foo///" | "///foo///" | "/" | "" | "/" | "foo///" | "///foo" | "." |
"///foo///bar" | "/","foo","bar" | "///foo///bar" | "///foo///bar" | "/" | "" | "/" | "foo///bar" | "///foo" | "bar" |
"/." | "/","." | "/." | "/." | "/" | "" | "/" | "." | "/" | "." |
"./" | ".","." | "./" | "./" | "" | "" | "" | "./" | "." | "." |
"/.." | "/",".." | "/.." | "/.." | "/" | "" | "/" | ".." | "/" | ".." |
"../" | "..","." | "../" | "../" | "" | "" | "" | "../" | ".." | "." |
"foo/." | "foo","." | "foo/." | "foo/." | "" | "" | "" | "foo/." | "foo" | "." |
"foo/.." | "foo",".." | "foo/.." | "foo/.." | "" | "" | "" | "foo/.." | "foo" | ".." |
"foo/./" | "foo",".","." | "foo/./" | "foo/./" | "" | "" | "" | "foo/./" | "foo/." | "." |
"foo/./bar" | "foo",".","bar" | "foo/./bar" | "foo/./bar" | "" | "" | "" | "foo/./bar" | "foo/." | "bar" |
"foo/.." | "foo",".." | "foo/.." | "foo/.." | "" | "" | "" | "foo/.." | "foo" | ".." |
"foo/../" | "foo","..","." | "foo/../" | "foo/../" | "" | "" | "" | "foo/../" | "foo/.." | "." |
"foo/../bar" | "foo","..","bar" | "foo/../bar" | "foo/../bar" | "" | "" | "" | "foo/../bar" | "foo/.." | "bar" |
"c:" | "c:" | "c:" | "c:" | "" | "" | "" | "c:" | "" | "c:" |
"c:/" | "c:","." | "c:/" | "c:/" | "" | "" | "" | "c:/" | "c:" | "." |
"c:foo" | "c:foo" | "c:foo" | "c:foo" | "" | "" | "" | "c:foo" | "" | "c:foo" |
"c:/foo" | "c:","foo" | "c:/foo" | "c:/foo" | "" | "" | "" | "c:/foo" | "c:" | "foo" |
"c:foo/" | "c:foo","." | "c:foo/" | "c:foo/" | "" | "" | "" | "c:foo/" | "c:foo" | "." |
"c:/foo/" | "c:","foo","." | "c:/foo/" | "c:/foo/" | "" | "" | "" | "c:/foo/" | "c:/foo" | "." |
"c:/foo/bar" | "c:","foo","bar" | "c:/foo/bar" | "c:/foo/bar" | "" | "" | "" | "c:/foo/bar" | "c:/foo" | "bar" |
"prn:" | "prn:" | "prn:" | "prn:" | "" | "" | "" | "prn:" | "" | "prn:" |
"c:\" | "c:\" | "c:\" | "c:\" | "" | "" | "" | "c:\" | "" | "c:\" |
"c:foo" | "c:foo" | "c:foo" | "c:foo" | "" | "" | "" | "c:foo" | "" | "c:foo" |
"c:\foo" | "c:\foo" | "c:\foo" | "c:\foo" | "" | "" | "" | "c:\foo" | "" | "c:\foo" |
"c:foo\" | "c:foo\" | "c:foo\" | "c:foo\" | "" | "" | "" | "c:foo\" | "" | "c:foo\" |
"c:\foo\" | "c:\foo\" | "c:\foo\" | "c:\foo\" | "" | "" | "" | "c:\foo\" | "" | "c:\foo\" |
"c:\foo/" | "c:\foo","." | "c:\foo/" | "c:\foo/" | "" | "" | "" | "c:\foo/" | "c:\foo" | "." |
"c:/foo\bar" | "c:","foo\bar" | "c:/foo\bar" | "c:/foo\bar" | "" | "" | "" | "c:/foo\bar" | "c:" | "foo\bar" |
Suggestions for <fstream> implementations
The change in semantics to functions taking const char* arguments can break existing code, but only on operating systems where implementations don't implicitly accept native format pathnames or operating systems that allow slashes in filenames. Thus on POSIX, Windows, and OpenVMS, for example, there is no problem if the implementation follows encouraged behavior.
For most of the Filesystem Library, there is no existing code, so the issue of preserving existing code that uses slashes in filenames doesn't arise. New code simply must use basic_path constructors with path_format_t arguments of native.
To preserve existing fstream code that uses slashes in filenames, an implementation may wish to provide a mechanism such as a macro to control selection of the old behavior.
Implementations are already required by the TR front-matter to provide a mechanism such as a macro to control selection of the old behavior (useful to guarantee protection of existing code) or new behavior (useful in new code, and code being ported from other systems) for headers. Because use of the rest of the Filesystem Library is independent of use of the <fstream> additions, affected implementations are encouraged to allow disabling the <fstream> additions separately from other TR features.
A rejected alternative was to supply new fstream classes in namespace sys, inheriting from the current classes, overriding the constructors and opens taking pathname arguments, and providing the additional overloads. In Lillehammer LWG members indicated lack of support for this alternative, feeling that costs outweigh benefits.
For member functions described as returning "const string_type" or "const external_string_type", implementations are permitted to return "const string_type&" or "const external_string_type&" respectively.
This allows implementations to avoid unnecessary copies. Return-by-value is specified as const to ensure programs won't break if moved to a return-by-reference implementation.
For example, the Boost implementation keeps the internal representation of a pathname in the portable format, so string() returns by reference and is inlined:
const string_type & string() const { return m_path; }
Howard Hinnant comments: This may inhibit optimization if rvalue reference is accepted. Const-qualified return types can't be moved from. I'd rather see either the return type specified as const string_type& or string_type.
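The concern about const-qualified returns can be illustrated with a minimal sketch. It is written against C++11, which did later adopt rvalue references, and it is not part of the proposal. The type tracked and the functions by_const_value and by_value are hypothetical names.

```cpp
// Records which assignment operator ran last.
struct tracked {
    char last_assignment = '-';   // 'c' = copy, 'm' = move

    tracked() = default;
    tracked(const tracked&) = default;
    tracked(tracked&&) = default;

    tracked& operator=(const tracked&) { last_assignment = 'c'; return *this; }
    tracked& operator=(tracked&&) { last_assignment = 'm'; return *this; }
};

// Const-qualified return by value, as in "const string_type".
const tracked by_const_value() { return tracked(); }

// Plain return by value, as in "string_type".
tracked by_value() { return tracked(); }
```

Assigning the result of by_const_value() selects the copy assignment operator, because a const rvalue cannot bind to tracked&&. Assigning the result of by_value() selects the move assignment operator.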
Beman Dawes comments: I can't make up my mind. Removing the const will bite users, but not very often. OTOH, excessive copying is a real concern, and if move semantics can alleviate that, I'm all for it. What does the LWG think?
The Boost implementation has basic_path functions canonize() and normalize() which return cleaned up string representations of a pathname. They have been removed from the proposal as messy to specify and implement, not hugely useful, and possible to implement by users as non-member functions without any loss of functionality or efficiency. There was also a concern the proposal was getting a bit large.
These functions can be added later as convenience functions if the LWG so desires.
Boost has a set of predicate functions that determine if a filename is valid for a particular operating or file system. These can be used as building blocks for functions that determine if an entire pathname is valid for a particular operating or file system.
Users can use these functions to ensure that pathnames are in fact portable to target operating or file systems, without having to actually test on the target systems.
These functions are not included in the proposal because of lack of time, and uncertainty as to their fit with the Standard Library. They can be added later if the LWG so desires.
There have been requests from two Boost users (Steve Hartmann, Thomas Matelich) for a function to return available disk space. For POSIX and Windows, this looks both useful and trivial to implement, but I'm reluctant to propose an untested operational function. My intent is to propose it later as an addition, assuming a trial implementation turns up no showstoppers.
This Filesystem Library is dedicated to my wife, Sonda, who provided the support necessary to see both a trial implementation and the proposal itself through to completion. She gave me the strength to continue after a difficult year of cancer treatment in the middle of it all.
Many people contributed technical comments, ideas, and suggestions to the Boost Filesystem Library. See http://www.boost.org/libs/filesystem/doc/index.htm#Acknowledgements.
Dietmar Kühl contributed the original Boost Filesystem Library directory_iterator design. Peter Dimov, Walter Landry, Rob Stewart, and Thomas Witt were particularly helpful in refining the library.
The create_directories, extension, basename, and replace_extension functions were developed by Vladimir Prus.
Howard Hinnant and John Maddock reviewed a draft of the proposal, and identified a number of mistakes or weaknesses, resulting in a more polished final document.
[ISO-POSIX] | ISO/IEC 9945:2003, IEEE Std 1003.1-2001, and The Open Group Base Specifications, Issue 6. Also known as The Single Unix® Specification, Version 3. Available from each of the organizations involved in its creation. For example, read online or download from www.unix.org/single_unix_specification/. The ISO JTC1/SC22/WG15 - POSIX homepage is www.open-std.org/jtc1/sc22/WG15/ |
[Abrahams] | Dave Abrahams, Error and Exception Handling, www.boost.org/more/error_handling.html |
© Copyright Beman Dawes, 2002-2005
Revised 2005-08-23