Distributed Manta - Planning and Configuration

This post was published in November 2017.


After my last attempt at a distributed Manta installation unfortunately failed (running manta-init twice in two different manta zones causes problems), I am now working on a new attempt. Distributing the various Manta services properly across different machines and datacenters is meant to ensure that neither the failure of individual nodes nor the failure of an entire datacenter makes the data stored in Manta unavailable. As described in the Operator's Guide, this is not a trivial task. I had always wondered who creates these configurations and which tool is used for it. The answer came on the manta-discuss mailing list. The man page for manta-adm (oddly enough, it can only be opened in the global zone, not in the Manta zone) reveals the procedure:

genconfig subcommand
    manta-adm genconfig "lab" | "coal"

    manta-adm genconfig [--directory=DIR] --from-file=FILE

    The manta-adm genconfig subcommand generates a JSON configuration file
    suitable for use with manta-adm update.  The images used for each
    service are the images configured in SAPI, which are generally the last
    images downloaded by manta-init(1), so this command is sometimes used
    as a shortcut for identifying the latest images that have been fetched
    for each service.

    When the first argument is "coal", the command produces a configuration
    suitable for a small VM-in-a-laptop deployment.  The configuration is
    always emitted to stdout.

    When the first argument is "lab", the command produces a configuration
    suitable for a larger single-server install.  The configuration is
    always emitted to stdout.

    The --from-file=FILE form can be used to generate a configuration
    suitable for a much larger, production-style deployment.  FILE is a
    JSON file in the format specified below that describes the parameters
    of the deployment, including the number of metadata shards and the set
    of availability zones, racks, and servers.  This form attempts to
    create a deployment that will survive failures of any component,
    server, rack, or availability zone as long as sufficient servers,
    racks, and availability zones are included in the input file.
    Availability zone and rack information can be omitted from the file, in
    which case the tool will generate a configuration ignoring rack-level
    and AZ-level considerations.  This tool uses a number of heuristics,
    and the output should be verified.

    By default, the generated configuration is emitted to stdout.  With the
    --directory option, the configuration will be written to files in the                      
    specified directory named by availability zone.  This option must be                       
    used if the servers in FILE span more than one availability zone.                          

    The input JSON file FILE should contain a single object with                               
    properties:                              

    nshards (positive integer)               
           the number of database shards to create, which is usually one                       
           more than the number of shards that are intended to store object                    
           metadata (in order to accommodate jobs and low-volume system                        
           metadata that's typically stored in shard 1)                                        

    servers (array of objects)               
           the list of servers available for deployment                                        

    Each element of servers is an object with properties:                                      

    type (string: either "metadata" or "storage")                                              
           identifies this server as a target for metadata services or                         
           storage services.  It's not strictly required that Manta                            
           services be partitioned in this way, but this tool requires that                    
           because most production deployments use two classes of hardware                     
           for these purposes.            

    uuid (string)
           the SDC compute node uuid for this server.  This must be unique
           within the entire region.
                                                                                            
    memory (positive integer)                                                                 
           gigabytes of memory available on this server.  This is currently                   
           only used for storage servers to determine the appropriate                         
           number of compute zones.                                                           
                                                                                            
    az (string)                                                                               
           (optional) availability zone.  If the value is omitted from any                    
           server, that server is placed into a default availability zone.
                                                                                            
    rack (string)                                                                             
           (optional) arbitrary identifier for the rack this server is part                   
           of.  Racks often represent fault domains, so the tool uses this                    
           information to attempt to distribute services across racks.  If                    
           the value is omitted from any server, that server is placed into                   
           a default rack.                                                                    
                                                                                            
    See the Manta Operator's Guide for a more complete discussion of sizing                   
    and laying out Manta services.
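
The input format described in the man page excerpt can be assembled and sanity-checked with a short script before handing it to manta-adm. The following is only a minimal sketch in Python: the helper name build_genconfig_input, the placeholder UUIDs, and the AZ and rack names are hypothetical and not taken from a real deployment, and the checks only mirror the rules stated in the excerpt above (positive nshards, type "metadata" or "storage", unique uuid, positive memory, optional az and rack).

```python
import json

VALID_TYPES = {"metadata", "storage"}


def _is_positive_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def build_genconfig_input(nshards, servers):
    """Build a dict shaped like the FILE described for --from-file.

    Hypothetical helper: it only applies the rules quoted from the man page.
    """
    if not _is_positive_int(nshards):
        raise ValueError("nshards must be a positive integer")

    seen_uuids = set()
    entries = []
    for server in servers:
        if server["type"] not in VALID_TYPES:
            raise ValueError("type must be 'metadata' or 'storage'")
        if server["uuid"] in seen_uuids:
            raise ValueError("duplicate server uuid: " + server["uuid"])
        seen_uuids.add(server["uuid"])
        if not _is_positive_int(server["memory"]):
            raise ValueError("memory must be a positive integer (gigabytes)")

        entry = {
            "type": server["type"],
            "uuid": server["uuid"],
            "memory": server["memory"],
        }
        # az and rack are optional; copy them only when present.
        for key in ("az", "rack"):
            if key in server:
                entry[key] = server[key]
        entries.append(entry)

    return {"nshards": nshards, "servers": entries}


# Placeholder values for illustration only.
example = build_genconfig_input(3, [
    {"type": "metadata", "uuid": "00000000-0000-0000-0000-000000000001",
     "memory": 64, "az": "az-a", "rack": "rack-1"},
    {"type": "storage", "uuid": "00000000-0000-0000-0000-000000000002",
     "memory": 128},
])
print(json.dumps(example, indent=2))
```

The printed JSON could be written to a file and passed via --from-file. Whether manta-adm accepts a particular mix of servers with and without az or rack is not checked here.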

A JSON file that gets manta-adm to generate a configuration could look like this, for example:

Called with the parameters above, manta-adm then produces the following output:

[root@fb94b510-640b-4357-b636-933deb0799f3 (de-gt-4:manta0) ~]# manta-adm genconfig --directory=/root/config --from-file=/root/dmanta.json
wrote config for "de-gt-1"                      
wrote config for "de-gt-3"                      
wrote config for "de-gt-4"                      

Summary of generated configuration:             

     SERVICE          SHARD          de-gt-1          de-gt-3          de-gt-4
     nameservice          -                1                1                1
     electric-moray       -                1                1                1
     storage              -                2                2                2
     authcache            -                1                                 1
     webapi               -                1                1                1
     loadbalancer         -                1                1                1
     jobsupervisor        -                1                                 1
     jobpuller            -                                 1                1
     medusa               -                1                1                
     ops                  -                                 1                
     madtom               -                1                                 
     marlin-dashboard     -                                                  1
     marlin               -              112               32              128
     postgres             1                1                1                1
     postgres             2                1                1                1
     postgres             3                1                1                1
     moray                1                1                1                1
     moray                2                1                1                1
     moray                3                1                1                1

warning: requested 3 shards with only 1 metadata server in at least one datacenter.  Multiple primary databases will wind up running on the same servers, and this configuration may not survive server failure.  This is not recommended.
[root@fb94b510-640b-4357-b636-933deb0799f3 (de-gt-4:manta0) ~]#
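
The warning at the end of the output concerns the number of metadata servers per datacenter relative to the number of shards. A rough pre-check on the input file might look like the sketch below. It is an assumption, not manta-adm's actual logic: the threshold used here (fewer metadata servers than shards) is a guess at what the warning means, and the function names and sample data are hypothetical.

```python
def metadata_servers_per_az(config, default_az="default"):
    """Count servers of type "metadata" per availability zone.

    Servers without an "az" property are grouped under default_az,
    a placeholder name chosen for this sketch.
    """
    counts = {}
    for server in config["servers"]:
        az = server.get("az", default_az)
        counts.setdefault(az, 0)  # also list AZs that have no metadata server
        if server["type"] == "metadata":
            counts[az] += 1
    return counts


def azs_below_shard_count(config):
    """Return AZs with fewer metadata servers than shards (assumed threshold)."""
    counts = metadata_servers_per_az(config)
    return sorted(az for az, n in counts.items() if n < config["nshards"])


# Hypothetical sample input, not the file used for the output above.
sample = {
    "nshards": 3,
    "servers": [
        {"type": "metadata", "uuid": "m1", "memory": 64, "az": "az-a"},
        {"type": "storage", "uuid": "s1", "memory": 128, "az": "az-a"},
        {"type": "metadata", "uuid": "m2", "memory": 64, "az": "az-b"},
        {"type": "metadata", "uuid": "m3", "memory": 64, "az": "az-b"},
        {"type": "metadata", "uuid": "m4", "memory": 64, "az": "az-b"},
    ],
}
print(azs_below_shard_count(sample))  # ['az-a']
```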

If the racks are not listed in the JSON file, manta-adm complains that there may only be one default rack. Incidentally, a sample configuration for "de-gt-1" as generated by manta-adm looks like this (otherwise you would somehow have to come up with it "by hand"):

So far so good.