Distributed File System (DFS)
A Distributed File System (DFS) is a file system that supports sharing of files and resources in the form of persistent storage over a network. The first file servers were developed in the 1970s, and Sun's Network File System (NFS) became the first widely used distributed file system after its introduction in 1985. Notable distributed file systems besides NFS are the Andrew File System (AFS) and the Common Internet File System (CIFS).
The Microsoft Distributed File System, or DFS, is a set of client and server services that allow a large enterprise to organize many distributed file shares into a distributed file system. DFS provides location transparency and redundancy to improve data availability in the face of failure or heavy load by allowing shares in multiple different locations to be logically grouped under one folder, or DFS root.
When users access a share that appears under the DFS root, they are really looking at a DFS link, and the DFS server transparently redirects them to the correct file server and share.
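The redirection described above can be illustrated with a minimal sketch. It is not Microsoft's implementation or referral protocol: the class `DfsNamespace`, its methods, and the server names are hypothetical, and the namespace is assumed to be a simple table that maps each link name to an ordered list of target shares.

```python
class DfsNamespace:
    """Toy model of a DFS root: link names map to one or more target shares.

    All names here are hypothetical; a real DFS server answers referral
    requests over the network instead of consulting an in-memory dict.
    """

    def __init__(self, root):
        self.root = root.rstrip("\\")
        self.links = {}  # lower-cased link name -> list of target shares

    def add_link(self, link_name, targets):
        self.links[link_name.lower()] = list(targets)

    def resolve(self, path, is_up=lambda target: True):
        """Translate a logical path under the root into a physical path."""
        prefix = self.root + "\\"
        if not path.lower().startswith(prefix.lower()):
            raise ValueError("path is not under this DFS root")
        rest = path[len(prefix):]
        link_name, _, remainder = rest.partition("\\")
        targets = self.links.get(link_name.lower())
        if targets is None:
            raise KeyError("no such DFS link: " + link_name)
        # Return the first target that is reachable (simple redundancy).
        for target in targets:
            if is_up(target):
                return target + ("\\" + remainder if remainder else "")
        raise LookupError("no target for this link is reachable")


ns = DfsNamespace(r"\\corp\public")
ns.add_link("Reports", [r"\\fs1\reports", r"\\fs2\reports"])
print(ns.resolve(r"\\corp\public\Reports\q1.xls"))
# If fs1 is down, the same logical path resolves to the second target.
print(ns.resolve(r"\\corp\public\Reports\q1.xls",
                 is_up=lambda t: not t.startswith(r"\\fs1")))
```

The client only ever names the logical path; which server actually holds the file is decided at resolution time, which is what gives the user location transparency.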
A DFS root can exist only on a server edition of Windows 2000 or on Windows Server 2003. Windows 2000 can host only one DFS root per server, while the Enterprise and Datacenter Editions of Windows Server 2003 can host multiple DFS roots on the same server. (A Samba server can also host the root of a DFS.)
There are two ways of implementing DFS on Windows 2000 and Windows Server 2003:
- Standalone DFS roots exist only on the local computer and thus do not use Active Directory. A standalone DFS can be accessed only on the computer on which it is created. It offers no fault tolerance and cannot be linked to any other DFS.
- Domain-based DFS roots exist within Active Directory and can have their information distributed to other domain controllers within the domain, which provides fault tolerance to DFS. DFS roots that exist on a domain must be hosted on a domain controller. This is to ensure that links with the same target get all their information replicated over the network. The file and root information is replicated via the Microsoft File Replication Service (FRS).
Clients and servers
A file server provides file services to clients. A client interface for a file service is formed by a set of primitive file operations, such as creating a file, deleting a file, reading from a file, and writing to a file. The primary hardware component that a file server controls is a set of local secondary-storage devices on which files are stored, and from which they are retrieved according to the client requests.
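As a rough illustration of such a client interface, the sketch below models the four primitive operations named above. The class `FileServer` and its method signatures are assumptions made for this example, and a dictionary in memory stands in for the secondary-storage devices.

```python
class FileServer:
    """In-memory stand-in for a file server (hypothetical interface).

    A dict of bytearrays plays the role of the secondary-storage devices.
    """

    def __init__(self):
        self._files = {}  # file name -> bytearray with the file contents

    def _lookup(self, name):
        try:
            return self._files[name]
        except KeyError:
            raise FileNotFoundError(name) from None

    def create(self, name):
        if name in self._files:
            raise FileExistsError(name)
        self._files[name] = bytearray()

    def delete(self, name):
        self._lookup(name)
        del self._files[name]

    def read(self, name, offset=0, length=None):
        data = self._lookup(name)
        end = len(data) if length is None else offset + length
        return bytes(data[offset:end])

    def write(self, name, offset, payload):
        data = self._lookup(name)
        if offset > len(data):
            # Writing past the end pads the gap with zero bytes.
            data.extend(b"\x00" * (offset - len(data)))
        data[offset:offset + len(payload)] = payload
        return len(payload)


server = FileServer()
server.create("notes.txt")
server.write("notes.txt", 0, b"hello")
print(server.read("notes.txt"))  # b'hello'
```

In a real system each of these calls would arrive as a request message from a client; the point here is only the shape of the interface.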
Distribution
A DFS is a file system whose clients, servers, and storage devices are dispersed among the machines of a distributed system or intranet. Accordingly, service activity has to be carried out across the network, and instead of a single centralized data repository, the system has multiple and independent storage devices. The concrete configuration and implementation of a DFS may vary - in some configurations, servers run on dedicated machines while in others a machine can be both a server and a client. A DFS can be implemented as part of a distributed operating system, or alternatively, by a software layer whose task is to manage the communication between conventional operating systems and file systems. The distinctive features of a DFS are the multiplicity and autonomy of clients and servers in the system.
Transparency
Ideally, a DFS should appear to its users to be a conventional, centralized file system. The multiplicity and dispersion of its servers and storage devices should be made invisible. That is, the client interface used by programs should not distinguish between local and remote files. It is up to the DFS to locate the files and to arrange for the transport of the data.
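One way to picture this is a client that offers a single `read` call and decides internally where the data lives. The sketch below is illustrative only: `Backend`, `RemoteServer`, `Client`, and the mount-table approach are assumptions of this example, and the "remote" backend does not actually use a network.

```python
class Backend:
    """Anything that can return file contents for a path (hypothetical)."""

    def __init__(self):
        self.files = {}  # path -> bytes

    def read(self, path):
        try:
            return self.files[path]
        except KeyError:
            raise FileNotFoundError(path) from None


class RemoteServer(Backend):
    def read(self, path):
        # A real client would send a request over the network here;
        # the in-memory lookup keeps the example self-contained.
        return super().read(path)


class Client:
    """Exposes one read() call; callers cannot tell local from remote."""

    def __init__(self):
        self._mounts = []  # (prefix without trailing slash, backend)

    def mount(self, prefix, backend):
        self._mounts.append((prefix.rstrip("/"), backend))
        # Longest prefix first, so the most specific mount wins.
        self._mounts.sort(key=lambda m: len(m[0]), reverse=True)

    def read(self, path):
        for prefix, backend in self._mounts:
            if path.startswith(prefix + "/"):
                return backend.read(path[len(prefix):])
        raise FileNotFoundError(path)


local, remote = Backend(), RemoteServer()
local.files["/etc/hosts"] = b"127.0.0.1 localhost"
remote.files["/doc.txt"] = b"shared document"

client = Client()
client.mount("/", local)
client.mount("/shared", remote)
print(client.read("/etc/hosts"))       # served by the local backend
print(client.read("/shared/doc.txt"))  # served by the remote backend
```

The calling program uses the same function and the same style of path in both cases, which is the property the paragraph above asks for.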
Performance
The most important performance measurement of a DFS is the amount of time needed to satisfy service requests. In conventional systems, this time consists of a disk-access time and a small amount of CPU-processing time. In a DFS, however, a remote access has the additional overhead attributed to the distributed structure. This overhead includes the time to deliver the request to a server, as well as the time to get the response across the network back to the client. For each direction, in addition to the transfer of the information, there is the CPU overhead of running the communication protocol software. The performance of a DFS can be viewed as another dimension of the DFS' transparency. That is, the performance of an ideal DFS would be comparable to that of a conventional file system.
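The cost components listed above can be added up in a small model. This is a back-of-the-envelope sketch, not a measurement method: the function names, the parameters, and the sample numbers are all made up for illustration, and protocol CPU time is assumed to be the same in each direction.

```python
def local_access_time(disk_ms, cpu_ms):
    """Conventional system: disk access plus a little CPU processing."""
    return disk_ms + cpu_ms


def remote_access_time(disk_ms, cpu_ms, request_transfer_ms,
                       response_transfer_ms, protocol_cpu_ms):
    """Remote access: the local cost plus the distribution overhead.

    protocol_cpu_ms is the CPU time spent in communication protocol
    software per direction (assumed equal for request and response).
    """
    network = request_transfer_ms + response_transfer_ms
    protocol = 2 * protocol_cpu_ms
    return local_access_time(disk_ms, cpu_ms) + network + protocol


# Illustrative numbers only (milliseconds).
local = local_access_time(disk_ms=8, cpu_ms=1)
remote = remote_access_time(disk_ms=8, cpu_ms=1, request_transfer_ms=2,
                            response_transfer_ms=3, protocol_cpu_ms=1)
print(local, remote, remote - local)  # 9 16 7
```

The difference between the two results is the overhead that an ideal, performance-transparent DFS would drive toward zero.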
Concurrent file updates
A DFS should allow multiple client processes on multiple machines not just to access but also to update the same files. Updates to a file from one client should therefore not interfere with access and updates from other clients. Concurrency control or locking may either be built into the file system or be provided by an add-on protocol.
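The locking approach can be sketched with one lock per file. This is a single-process illustration under stated assumptions: threads stand in for client processes on different machines, `LockingFileStore` and `run_clients` are hypothetical names, and a real DFS would need a lock service that works across the network.

```python
import threading


class LockingFileStore:
    """File store that serializes updates to each file with its own lock."""

    def __init__(self):
        self._files = {}  # file name -> bytes
        self._locks = {}  # file name -> threading.Lock
        self._table_lock = threading.Lock()  # protects the lock table

    def _lock_for(self, name):
        with self._table_lock:
            return self._locks.setdefault(name, threading.Lock())

    def read(self, name):
        with self._lock_for(name):
            return self._files.get(name, b"")

    def update(self, name, fn):
        """Replace the contents with fn(old contents) as one atomic step."""
        with self._lock_for(name):
            self._files[name] = fn(self._files.get(name, b""))


def run_clients(store, name, clients=4, appends=250):
    """Start several 'clients' that each append one byte many times."""
    def client():
        for _ in range(appends):
            store.update(name, lambda old: old + b"x")

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(store.read(name))


print(run_clients(LockingFileStore(), "shared.log"))  # 1000
```

Because each read-modify-write happens while the file's lock is held, no append is lost, whatever order the clients run in.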
Distributed data store
A distributed data store is a network in which a user stores his or her information on a number of peer network nodes. The user also usually reciprocates and allows users to use his or her computer as a storage node as well. Information may or may not be accessible to other users depending on the design of the network.
Most peer-to-peer networks do not have distributed data stores, in that a user's data is available only while their node is on the network. However, this distinction is somewhat blurred in a system such as BitTorrent, where the originating node can go offline while the content continues to be served. Still, this is the case only for individual files requested by the redistributors, in contrast to a network such as Freenet, where all computers are made available to serve all files.
Enhanced DFS management is one of the "branch office server management" features added to Windows Server 2003 in the R2 release in 2006.
Windows Server 2003 R2
An update of Windows Server 2003, officially called R2, was released to manufacturing on December 6, 2005. It is distributed as a second CD, with the first CD being Windows Server 2003 SP1. The release adds a number of optionally installable features to Windows Server 2003 with SP1.
New features
- Branch Office Server Management
  - Centralised management tools for files and printers
  - Enhanced Distributed File System (DFS) namespace management interface
  - More-efficient WAN data replication with Remote Differential Compression
- Identity and Access Management
  - Extranet Single Sign-On and identity federation
  - Centralised administration of extranet application access
  - Automated disabling of extranet access based on Active Directory account information
  - User access logging
  - Cross-platform web Single Sign-On and password synchronisation using Network Information Service (NIS)
- Storage Management
  - File Server Resource Manager (storage utilisation reporting)
  - Enhanced quota management
  - File screening, which limits the file types allowed
  - Storage Manager for Storage Area Networks (SAN) (storage array configuration)
- Server Virtualisation
  - A new licensing policy allows up to 4 virtual instances
- Utilities and SDK for UNIX-Based Applications add-on, giving a relatively full Unix development environment
  - Base Utilities
  - SVR-5 Utilities
  - Base SDK
  - GNU SDK
  - GNU Utilities
  - UNIX Perl
  - Visual Studio Debugger Add-in