2022-11-23
HDFS Federation
This guide provides an overview of the HDFS Federation feature and explains how to configure and manage a federated cluster.

Background

HDFS has two main layers:

• Namespace
  o Consists of directories, files and blocks.
  o Supports all namespace-related file system operations, such as creating, deleting, modifying and listing files and directories.
• Block Storage Service, which has two parts:
  o Block Management (performed in the Namenode)
    - Provides Datanode cluster membership by handling registrations and periodic heartbeats.
    - Processes block reports and maintains the location of blocks.
    - Supports block-related operations such as create, delete, modify and get block location.
    - Manages replica placement, replicates under-replicated blocks, and deletes over-replicated blocks.
  o Storage: provided by Datanodes, which store blocks on the local file system and allow read/write access.

The prior HDFS architecture allows only a single namespace for the entire cluster. In that configuration, a single Namenode manages the namespace. HDFS Federation addresses this limitation by adding support for multiple Namenodes/namespaces to HDFS.

Multiple Namenodes/Namespaces

In order to scale the name service horizontally, federation uses multiple independent Namenodes/namespaces. The Namenodes are federated: they are independent and do not require coordination with each other. The Datanodes are used as common storage for blocks by all the Namenodes. Each Datanode registers with all the Namenodes in the cluster. Datanodes send periodic heartbeats and block reports to all the Namenodes, and they handle commands from all of them.
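The relationship described above (independent Namenodes, with every Datanode registering and sending heartbeats to each of them) can be illustrated with a toy model in Python. This is only a sketch for intuition, not how HDFS implements it: the class names, method names, and the IDs `ns1`, `ns2`, `dn1` to `dn3` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Namenode:
    """Toy stand-in for a Namenode that owns exactly one namespace."""
    namespace: str
    registered: set = field(default_factory=set)
    heartbeats: dict = field(default_factory=dict)

    def register(self, datanode_id: str) -> None:
        # Record cluster membership for this Datanode.
        self.registered.add(datanode_id)
        self.heartbeats[datanode_id] = 0

    def heartbeat(self, datanode_id: str) -> None:
        # Count heartbeats received from this Datanode.
        self.heartbeats[datanode_id] += 1


@dataclass
class Datanode:
    """Toy stand-in for a Datanode shared by all Namenodes."""
    datanode_id: str

    def register_with_all(self, namenodes) -> None:
        # Each Datanode registers with every Namenode in the cluster.
        for nn in namenodes:
            nn.register(self.datanode_id)

    def send_heartbeats(self, namenodes) -> None:
        # Heartbeats go to every Namenode; the Namenodes never talk to each other.
        for nn in namenodes:
            nn.heartbeat(self.datanode_id)


# Hypothetical cluster: two namespaces, three Datanodes.
namenodes = [Namenode("ns1"), Namenode("ns2")]
datanodes = [Datanode("dn1"), Datanode("dn2"), Datanode("dn3")]

for dn in datanodes:
    dn.register_with_all(namenodes)
    dn.send_heartbeats(namenodes)

for nn in namenodes:
    print(nn.namespace, sorted(nn.registered))
```

Note that the two `Namenode` objects share no state. The only thing they have in common is the set of Datanodes, which mirrors the "common storage" role described in the text.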
Users may use ViewFs to create personalized namespace views. ViewFs is analogous to client-side mount tables in some Unix/Linux systems.

Block Pool

A Block Pool is a set of blocks that belong to a single namespace. Datanodes store blocks for all the block pools in the cluster. Each Block Pool is managed independently. This allows a namespace to generate Block IDs for new blocks without the need for coordination with the other namespaces. A Namenode failure does not prevent the Datanode from serving other Namenodes in the cluster.

A Namespace and its block pool together are called a Namespace Volume. It is a self-contained unit of management. When a Namenode/namespace is deleted, the corresponding block pool at the Datanodes is deleted. Each namespace volume is upgraded as a unit during a cluster upgrade.

ClusterID

A ClusterID identifier is used to identify all the nodes in the cluster. When a Namenode is formatted, this identifier is either provided or auto-generated. The same ID should be used when formatting the other Namenodes into the cluster.

Key Benefits

• Namespace Scalability - Federation adds horizontal scaling of the namespace. Large deployments, or deployments with a lot of small files, benefit from namespace scaling because more Namenodes can be added to the cluster.
• Performance - File system throughput is not limited by a single Namenode. Adding more Namenodes to the cluster scales the file system read/write throughput.
• Isolation - A single Namenode offers no isolation in a multi-user environment. For example, an experimental application can overload the Namenode and slow down production-critical applications. With multiple Namenodes, different categories of applications and users can be isolated to different namespaces.

Federation Configuration

Federation configuration is backward compatible and allows existing single-Namenode configurations to work without any change. The new configuration is designed so that all the nodes in the cluster have the same configuration, without the need to deploy different configurations based on the type of node.

Federation adds a new NameServiceID abstraction. A Namenode and its corresponding secondary/backup/checkpointer nodes all belong to a NameServiceID. In order to support a single configuration file, the Namenode and secondary/backup/checkpointer configuration parameters are suffixed with the NameServiceID.

Configuration:

Step 1: Add the dfs.nameservices parameter to your configuration and configure it with a comma-separated list of NameServiceIDs. This will be used by the Datanodes to determine the Namenodes in the cluster.

Step 2: For each Namenode and Secondary Namenode/BackupNode/Checkpointer, add the following configuration parameters, suffixed with the corresponding NameServiceID, to the common configuration file:

Daemon               Configuration Parameter
Namenode             dfs.namenode.rpc-address
                     dfs.namenode.servicerpc-address
                     dfs.namenode.
                     dfs.namenode.
                     dfs.namenode.keytab.file
                     dfs.namenode.name.dir
                     dfs.namenode.edits.dir
                     dfs.namenode.checkpoint.dir
                     dfs.namenode.checkpoint.edits.dir
Secondary Namenode   dfs.namenode.secondary.
                     dfs.secondary.namenode.keytab.file
BackupNode           dfs.namenode.backup.address
                     dfs.secondary.namenode.keytab.file

Here is an example configuration with two Namenodes:
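As a rough illustration of Steps 1 and 2, the Python sketch below generates the property entries for two Namenodes by suffixing each per-Namenode parameter with its NameServiceID. Everything specific in it is an assumption made for illustration: the NameServiceIDs `ns1` and `ns2`, the host names, the ports, and the directory paths are hypothetical, and only a few of the parameters from the table are shown. The output follows the usual Hadoop `<configuration>`/`<property>` XML layout.

```python
from xml.sax.saxutils import escape

# Hypothetical NameServiceIDs, hosts and directories (illustration only).
NAMESERVICES = {
    "ns1": {"host": "nn-host1", "name_dir": "/data/nn1/name"},
    "ns2": {"host": "nn-host2", "name_dir": "/data/nn2/name"},
}
RPC_PORT = 8020          # assumed client RPC port
SERVICE_RPC_PORT = 8021  # assumed service RPC port


def build_properties(nameservices):
    """Return (name, value) pairs for a federated configuration."""
    # Step 1: comma-separated list of all NameServiceIDs.
    props = [("dfs.nameservices", ",".join(nameservices))]
    # Step 2: per-Namenode parameters, each suffixed with its NameServiceID.
    for ns_id, info in nameservices.items():
        props.append((f"dfs.namenode.rpc-address.{ns_id}",
                      f"{info['host']}:{RPC_PORT}"))
        props.append((f"dfs.namenode.servicerpc-address.{ns_id}",
                      f"{info['host']}:{SERVICE_RPC_PORT}"))
        props.append((f"dfs.namenode.name.dir.{ns_id}", info["name_dir"]))
    return props


def to_xml(props):
    """Render the pairs in the <configuration>/<property> layout."""
    lines = ["<configuration>"]
    for name, value in props:
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


properties = build_properties(NAMESERVICES)
print(to_xml(properties))
```

Because every parameter carries its NameServiceID as a suffix, the same generated file can be deployed to every node in the cluster, which is the single-configuration goal described above.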