Transcript: Deploying PlasmaFS

We assume here:

  • Plasma is already built and installed on the operator node.
  • PostgreSQL is configured and running on all designated namenodes.

The following example uses just one host, "office1", which takes the role of both the namenode and the datanode. Using more than one host is easy: it only means adding further names to the right files (as indicated below).

We assume now that you are logged in on the operator node as the user that will run the PlasmaFS daemons. Also, if there is more than one node, you will need ssh access to the other nodes.

Preparation: Copying clusterconfig

There is a directory clusterconfig in the installed Plasma tree. Locate it, and copy it to your working directory:

$ cp -R /opt/plasma/share/plasma/clusterconfig .
$ cd clusterconfig

Also, ensure that /opt/plasma/bin is in your PATH.
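
A minimal sketch of the PATH step, assuming a POSIX shell. The helper name add_to_path is made up for this example and is not part of Plasma; it prepends the directory only if it is not already listed.

```shell
# Prepend a directory to PATH unless it is already listed.
# (add_to_path is a hypothetical helper, not a Plasma command.)
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

add_to_path /opt/plasma/bin
```

Put the call into your shell profile if you want it to survive a new login.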

Creating the instance

Invent a name for the cluster. In this example, the cluster is called "pfs".

You also need a path under which the software will be installed on the cluster nodes. This must be different from the path where Plasma was originally installed (i.e. not /opt/plasma in this example). We choose /data/plasma here. The data blocks will also be stored under this path, so it should be on your "big" partition.

The scripts assume that /data/plasma is writable by the user running the daemons.
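
Before creating the instance it can be worth checking this assumption. The following is only a sketch; check_prefix is a name invented for this example, and it tests the local host only.

```shell
# Report whether the given directory exists and is writable by the
# current user. (check_prefix is a hypothetical helper.)
check_prefix() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "$1: writable"
    else
        echo "$1: missing or not writable" >&2
        return 1
    fi
}

check_prefix /data/plasma || echo "fix this before creating the instance"
```

On a cluster with several nodes the same test would have to be run on each of them, e.g. over ssh.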

Create the instance:

$ ./ pfs /data/plasma 1M
Creating new instance:
Name:      pfs
Prefix:    /data/plasma
Blocksize: 1048576
Template:  template

You can also select a blocksize other than 1M. The number should be a power of 2 and a multiple of 65536.
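
The two conditions on the blocksize can be checked with a bit of shell arithmetic. This is a sketch, not what the Plasma scripts do internally: the function names are invented here, and the K/M/G suffixes are assumed to be binary units (which matches 1M = 1048576 in the output above).

```shell
# Convert a size such as 64K, 1M or 10G to bytes.
# Assumption: suffixes are upper case and mean binary units.
size_to_bytes() {
    num=${1%[KMG]}
    case "$1" in
        *K) echo $((num * 1024)) ;;
        *M) echo $((num * 1024 * 1024)) ;;
        *G) echo $((num * 1024 * 1024 * 1024)) ;;
        *)  echo "$1" ;;
    esac
}

# Succeed if the byte count is a power of 2 and a multiple of 65536.
is_valid_blocksize() {
    n=$1
    [ "$n" -gt 0 ] || return 1
    [ $((n & (n - 1))) -eq 0 ] || return 1   # power of 2
    [ $((n % 65536)) -eq 0 ]                 # multiple of 65536
}

for s in 64K 1M 4M 96K 32K; do
    if is_valid_blocksize "$(size_to_bytes "$s")"; then
        echo "$s: ok"
    else
        echo "$s: rejected"
    fi
done
```

Here 96K is rejected because it is not a power of 2, and 32K because it is smaller than 65536.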

Configure the instance

There is now a new directory instances/pfs with a number of files:

$ ls instances/pfs
authnode.conf   datanode.hosts  namenode.hosts  password_pnobody
authnode.hosts  global.conf     nfsnode.conf    password_proot
datanode.conf   namenode.conf   nfsnode.hosts

These are all configuration files. You normally only need to change the *.hosts files:

  • namenode.hosts: List the namenode hosts here.
  • datanode.hosts: List the datanode hosts here.
  • authnode.hosts: List all hosts where you want to have access to PlasmaFS via the authentication daemon. These are normally the computers where you start jobs.
  • nfsnode.hosts: List all hosts where you want to run the NFS bridge (i.e. where you want to mount PlasmaFS directly).

In my example, I put "office1" into namenode.hosts, datanode.hosts, and authnode.hosts.
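
With more than one host it helps to verify the ssh access before deploying. The sketch below rests on two assumptions about the *.hosts files that you should check against your own copies: one host name per line, and "#" starting a comment. The function names list_hosts and check_ssh are invented for this example.

```shell
# Print the host names in a *.hosts file, skipping blank lines and
# "#" comments. Assumption: one host name per line.
list_hosts() {
    sed -e 's/#.*//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}

# Try a non-interactive ssh login on every host named in the file.
check_ssh() {
    for h in $(list_hosts "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true; then
            echo "$h: ok"
        else
            echo "$h: ssh failed"
        fi
    done
}

# Example call (not run here):
#   check_ssh instances/pfs/datanode.hosts
```

BatchMode=yes makes ssh fail instead of prompting for a password, so a host without key-based login shows up as "ssh failed".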

The directory also contains two passwords: password_pnobody, and password_proot. With a few exceptions you will never need to enter these passwords. However, the passwords are auto-generated, and if you want to change them, this is now the ideal time.

Deploy the instance

This step copies the runtime files to the cluster nodes (using scp):

$ ./ pfs
Deploying to office1:
authnode.conf                                100% 1061     1.0KB/s   00:00    
datanode.conf                                100% 1282     1.3KB/s   00:00    
global.conf                                  100%  180     0.2KB/s   00:00    
namenode.conf                                100% 1670     1.6KB/s   00:00    
nfsnode.conf                                 100% 1557     1.5KB/s   00:00    
authnode.hosts                               100%  196     0.2KB/s   00:00    
datanode.hosts                               100%  110     0.1KB/s   00:00    
namenode.hosts                               100%  110     0.1KB/s   00:00    
nfsnode.hosts                                100%   96     0.1KB/s   00:00    
                                             100% 1967     1.9KB/s   00:00    
                                             100% 1966     1.9KB/s   00:00    
                                             100% 1960     1.9KB/s   00:00    
                                             100% 1967     1.9KB/s   00:00    
password_proot                               100%   33     0.0KB/s   00:00    
password_pnobody                             100%   33     0.0KB/s   00:00    
namenode.sql                                 100% 4958     4.8KB/s   00:00    
plasmad                                      100% 6283KB   6.1MB/s   00:00    
nfs3d                                        100% 6257KB   6.1MB/s   00:00    
plasma_datanode_init                         100% 3217KB   3.1MB/s   00:00    
plasma_admin                                 100% 5237KB   5.1MB/s   00:01    
netplex-admin                                100% 4406KB   4.3MB/s   00:00    
plasma                                       100% 6235KB   6.1MB/s   00:00    

Initialize the database(s)

$ ./ pfs
Creating database plasma_pfs on office1:

Start the namenode

$ ./ start pfs
office1: ok

Initialize the datanodes

This step will create the data files storing the blocks. Here, we assume that all datanodes have the same size (10G). If you have nodes with different sizes, repeat the step node by node (replace "all" with the host name(s)).

$ ./ pfs 10G all
Using 10240 blocks
Testing namenode connectivity (office1:2730):
Checking office1:
Initializing on office1:
Initialized datanode directory /data/plasma/data with identity 66b0c4621902772bbd910b98ff0cff84
  starting datanode on office1:
  adding identity to namenode
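
The line "Using 10240 blocks" follows from the datanode size and the blocksize of the instance. As a sketch of the arithmetic (function names invented here, binary units and integer division assumed):

```shell
# Convert a size such as 1M or 10G to bytes.
# Assumption: suffixes are upper case and mean binary units.
size_to_bytes() {
    num=${1%[KMG]}
    case "$1" in
        *K) echo $((num * 1024)) ;;
        *M) echo $((num * 1024 * 1024)) ;;
        *G) echo $((num * 1024 * 1024 * 1024)) ;;
        *)  echo "$1" ;;
    esac
}

# Number of whole blocks that fit into a datanode of the given size.
block_count() {
    echo $(( $(size_to_bytes "$1") / $(size_to_bytes "$2") ))
}

block_count 10G 1M    # prints 10240
```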

Load the users and groups

In this example we want to have exactly the users and groups in PlasmaFS that already exist locally on the computer. Because of this, we can simply load /etc/passwd and /etc/group:

$ ./ pfs
Testing namenode connectivity:
Setting passwd and group

If you need different users and groups, add the switches -passwd <file> and -group <file> to point to other files (in the same format). Note that the files should not contain passwords!
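
If you prepare your own files, you can check for stray password hashes first. This sketch assumes the usual colon-separated passwd/group layout with the password in the second field, and treats "x", "*", "!", "!!" and an empty field as harmless placeholders. The function name is invented for this example.

```shell
# Print the names of entries whose password field (2nd field) holds
# something other than a placeholder.
# (find_password_entries is a hypothetical helper.)
find_password_entries() {
    awk -F: '$2 != "" && $2 != "x" && $2 != "*" && $2 != "!" && $2 != "!!" { print $1 }' "$1"
}

# Example call (not run here):
#   find_password_entries my_passwd
```

No output means that no entry carries a password.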

Start the remaining daemons

$ ./ start pfs
office1: office1: Plasma datanode already running
office1: office1: Plasma namenode already running
office1: ok
NFS nodes:

Create .plasmafs

Log in as the user you want to use for accessing PlasmaFS, on the host where you want to work. Note that this host must be listed in the appropriate *.hosts file. Create a file ~/.plasmafs:

plasmafs {
  cluster {
    clustername = "pfs";
    node_list = "/data/plasma/etc/namenode.hosts";
    port = 2730;
  }
}

Check whether this worked:

$ plasma params
clustername = pfs
coordinator = office1
blocksize = 1048576
lock_timeout = 300
replication = 1
data_security_level = Authenticated
data_timeout = 60
$ plasma fsstat
Total:                       10240 blocks
Used:                            0 blocks
Transitional:                    0 blocks
Free:                        10240 blocks

Enabled nodes:   1
Alive nodes:     1
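
For monitoring, the fsstat output can be condensed into one line. This is only a sketch that relies on the "Total:" and "Used:" lines looking as shown above; fsstat_usage is a name invented for this example.

```shell
# Read "plasma fsstat" output on stdin and print a one-line summary.
# (fsstat_usage is a hypothetical helper.)
fsstat_usage() {
    awk '
        $1 == "Total:" { total = $2 }
        $1 == "Used:"  { used = $2 }
        END {
            if (total > 0)
                printf "%d of %d blocks used (%.1f%%)\n", used, total, 100 * used / total
        }
    '
}

# Example call (not run here):
#   plasma fsstat | fsstat_usage
```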

This web site is published by Informatikbüro Gerd Stolpmann