SUSE Cloud – missing cinder key on computes

2017-11-01 14:30:53.970 27835 ERROR nova.virt.libvirt.driver [instance: c5618826-98cb-4fd6-9d6f-b8899bd320b7] libvirtError: Secret not found: no secret with matching uuid '5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6'
2017-11-01 14:30:53.970 27835 ERROR nova.virt.libvirt.driver [instance: c5618826-98cb-4fd6-9d6f-b8899bd320b7] 
2017-11-01 14:30:53.971 27835 ERROR nova.virt.block_device [req-9f046c95-fecf-46e5-874d-43b42da1e63f 62169e96ed4b485aa2dfb2ca3235305c 05f20019f1c94952937a7f34087f5471 - - -] [instance: c5618826-98cb-4fd6-9d6f-b8899bd320b7] Driver failed to attach volume 9f33b42f-79ba-472f-8e10-9525f186cde1 at /dev/vdb

If the compute node has no libvirt secret like the one below (its usage contains crowbar-$ID)

# virsh secret-list 
 UUID                                  Usage
--------------------------------------------------------------------------------
 5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6  ceph crowbar-5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6 secret

you have to create it yourself:

#!/bin/bash

ID="5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6"
# get cinder key from ceph cluster - ceph auth get-key client.cinder
CINDERKEY="AQA4cw1aa2tAAhAAxYl2l/lCaer3squRBdXBYg=="
FILE="<secret ephemeral='no' private='no'><uuid>$ID</uuid><usage type='ceph'><name>client.cinder secret</name></usage></secret>"
FILENAME="/tmp/secret.xml"

for host in 01 02 03 04 05; do
	dest="compute${host}"
	echo "Verify host $dest:"
	if ! ssh $dest virsh secret-get-value $ID; then
		echo "Create secret for cinder user."
		ssh $dest "echo \"$FILE\" > $FILENAME"
		ssh $dest virsh secret-define --file $FILENAME
		ssh $dest virsh secret-set-value --secret $ID --base64 $CINDERKEY
	fi
	echo "ok!"	
done
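
Before running the loop it can help to check a saved virsh secret-list listing for the UUID. The following is only a minimal sketch: the function name has_secret is hypothetical, and it just matches the first column of text read from stdin, it does not talk to libvirt itself.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the given UUID appears in the first
# column of `virsh secret-list` output read from stdin.
has_secret() {
	local id="$1"
	awk -v id="$id" '$1 == id { found = 1 } END { exit !found }'
}

# Sample listing copied from the output above.
listing=" UUID                                  Usage
--------------------------------------------------------------------------------
 5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6  ceph crowbar-5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6 secret"

if printf '%s\n' "$listing" | has_secret "5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6"; then
	echo "secret present"
else
	echo "secret missing"
fi
```

On a real node the listing would come from something like ssh compute01 virsh secret-list | has_secret "$ID".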

SUSE OpenStack Cloud – debugging sleshammer

To get a login shell during discovery, before NFS is mounted, add the DISCOVERY_ROOT_PASSWORD parameter to the provisioner proposal:

root@admin:~ # crowbarctl proposal edit provisioner default
{
  "id": "provisioner-default",
  "description": "Created on Thu, 09 Nov 2017 15:43:20 +0100",
  "attributes": {
    "provisioner": {
[...]
      "discovery": {
        "append": "DISCOVERY_ROOT_PASSWORD=replace-with-your-password"
      }
[...]
    }
}
root@admin:~ # crowbarctl proposal commit provisioner default

ceph metasearch – elasticsearch backend – part 2

requirements

  • ceph cluster (kraken release)
  • elasticsearch

The rgw syncer is only triggered in multisite configurations, so we need to set up a second zone for the metasearch.

environment / settings

export rgwhost="192.168.122.80"
export elastichost="192.168.122.71"
export realm="demo"
export zonegrp="zone-1"
export zone1="zone1-a"
export zone2="zone1-b" # used for metasearch
export sync_akey="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 20 | head -n 1 )"
export sync_skey="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 40 | head -n 1 )"
export user_akey="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 20 | head -n 1 )"
export user_skey="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 40 | head -n 1 )"

setup (see also part1)

create first zone
# radosgw-admin realm create --rgw-realm=${realm} --default
# radosgw-admin zonegroup create --rgw-realm=${realm} --rgw-zonegroup=${zonegrp} --endpoints=http://${rgwhost}:80 --master --default
# radosgw-admin zone create --rgw-realm=${realm} --rgw-zonegroup=${zonegrp} --rgw-zone=${zone1} --endpoints=http://${rgwhost}:80 --access-key=${sync_akey} --secret=${sync_skey} --master --default
# radosgw-admin user create --uid=sync --display-name="zone sync" --access-key=${sync_akey} --secret=${sync_skey} --system
# radosgw-admin period update --commit
# systemctl restart ceph-radosgw@rgw.${rgwhost}
create second zone
# radosgw-admin zone create --rgw-realm=${realm} --rgw-zonegroup=${zonegrp} --rgw-zone=${zone2} --access-key=${sync_akey} --secret=${sync_skey} --endpoints=http://${rgwhost}:81
# radosgw-admin zone modify --rgw-realm=${realm} --rgw-zonegroup=${zonegrp} --rgw-zone=${zone2} --tier-type=elasticsearch --tier-config=endpoint=http://${elastichost}:9200,num_replicas=1,num_shards=10
# radosgw-admin period update --commit

Restart the first radosgw and then start the second radosgw. For example:

# screen -dmS rgw2zone radosgw --keyring /etc/ceph/ceph.client.admin.keyring -f --rgw-zone=${zone2} --rgw-frontends="civetweb port=81"

Check elasticsearch for the new index:

# curl http://${elastichost}:9200/_cat/indices | grep rgw-${realm}
yellow open rgw-demo    z0UiKOOFQl682yILobYbMw 5 1 1 0 11.7kb 11.7kb

modify header/metadata

create a user

radosgw-admin user create --uid=rmichel --display-name="rmichel" --access-key=${user_akey} --secret=${user_skey}

upload some test data

s3cmd is configured with ${user_akey} and ${user_skey} as credentials and ${rgwhost}:80 as the endpoint.

# s3cmd modify --add-header x-amz-meta-color:green s3://bucket1/admin.key
modify: 's3://bucket1/admin.key'
# s3cmd info s3://bucket1/admin.key
s3://bucket1/admin.key (object):
   File size: 63
   Last mod:  Thu, 27 Apr 2017 21:14:55 GMT
   MIME type: text/plain
   Storage:   STANDARD
   MD5 sum:   ee40e385a45c4855bd360cfbdbd48711
   SSE:       none
   policy:    <?xml version="1.0" encoding="UTF-8"?><ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>bucket1</Name><Prefix></Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><IsTruncated>false</IsTruncated><Contents><Key>admin.key</Key><LastModified>2017-04-27T21:14:55.494Z</LastModified><ETag>&quot;ee40e385a45c4855bd360cfbdbd48711&quot;</ETag><Size>63</Size><StorageClass>STANDARD</StorageClass><Owner><ID>rmichel</ID><DisplayName>rmichel</DisplayName></Owner></Contents></ListBucketResult>
   cors:      none
   ACL:       rmichel: FULL_CONTROL
   x-amz-meta-color: green
   x-amz-meta-s3cmd-attrs: uid:0/gname:root/uname:root/gid:0/mode:33152/mtime:1493326171/atime:1493326171/md5:ee40e385a45c4855bd360cfbdbd48711/ctime:1493326171

query elasticsearch

The radosgw creates an index named rgw-${realm} (ref ceph.git)

In my case the url is http://${elastichost}:9200/rgw-${realm}/

# curl 'http://192.168.122.71:9200/rgw-demo/_search?q=meta.custom.color=green' | python -m json.tool
{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [
            {
                "_id": "d9b0c7a5-f9e5-4c6e-a0c2-48642840c98b.14125.1:admin.key:",
                "_index": "rgw-demo",
                "_score": 0.23691465,
                "_source": {
                    "bucket": "bucket1",
                    "instance": "",
                    "meta": {
                        "content_type": "text/plain",
                        "custom": {
                            "color": "green",
                            "s3cmd-attrs": "uid:0/gname:root/uname:root/gid:0/mode:33152/mtime:1493326171/atime:1493326171/md5:ee40e385a45c4855bd360cfbdbd48711/ctime:1493326171"
                        },
                        "etag": "ee40e385a45c4855bd360cfbdbd48711",
                        "mtime": "2017-04-27T21:14:55.483Z",
                        "size": 63,
                        "x-amz-copy-source": "/bucket1/admin.key",
                        "x-amz-date": "Thu, 27 Apr 2017 21:14:55 +0000",
                        "x-amz-metadata-directive": "REPLACE"
                    },
                    "name": "admin.key",
                    "owner": {
                        "display_name": "rmichel",
                        "id": "rmichel"
                    },
                    "permissions": [
                        "rmichel"
                    ]
                },
                "_type": "object"
            }
        ],
        "max_score": 0.23691465,
        "total": 1
    },
    "timed_out": false,
    "took": 102
}
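
For repeated lookups, the search URL can be assembled from the host, realm and metadata key. The sketch below is an assumption-laden example: the function name es_meta_query_url is hypothetical, port 9200 and the rgw-${realm} index name are taken from the text, and the query uses the field:value form of the Elasticsearch query-string syntax rather than the = used in the command above.

```shell
#!/bin/bash
# Hypothetical helper: prints the search URL for one custom metadata key.
# Assumes the Elasticsearch query-string syntax field:value.
es_meta_query_url() {
	local host="$1" realm="$2" key="$3" value="$4"
	printf 'http://%s:9200/rgw-%s/_search?q=meta.custom.%s:%s\n' \
		"$host" "$realm" "$key" "$value"
}

es_meta_query_url "192.168.122.71" "demo" "color" "green"
```

The result can be passed to curl, e.g. curl "$(es_meta_query_url "$elastichost" "$realm" color green)" | python -m json.tool. Values containing spaces or special characters would need URL encoding first, which this sketch does not do.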

ceph radosgw (set)lifecycle – AWS v4 is broken

First – s3cmd config
Setting signature_v2 = true in the s3cmd config is not enough! You have to pass --signature-v2 on the command line.

Second – ‘Prefix’ tag
You have to specify a Prefix tag, and yes, with a capital P!

without prefix tag
<LifecycleConfiguration>
    <Rule>
        <ID>ExampleRule</ID>
	<Status>Enabled</Status>
        <Expiration>
             <Days>1</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
[root@kraken ~]# s3cmd setlifecycle lc.xml s3://bucket1 --signature-v2
ERROR: S3 error: 403 (AccessDenied)
with closing prefix tag
<LifecycleConfiguration>
    <Rule>
        <ID>ExampleRule</ID>
	</Prefix>
	<Status>Enabled</Status>
        <Expiration>
             <Days>1</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
[root@kraken ~]# s3cmd setlifecycle lc.xml s3://bucket1 --signature-v2
ERROR: S3 error: 403 (AccessDenied)
with prefix tag – working!
<LifecycleConfiguration>
    <Rule>
        <ID>ExampleRule</ID>
	<Prefix></Prefix>
	<Status>Enabled</Status>
        <Expiration>
             <Days>1</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
[root@kraken ~]# s3cmd setlifecycle lc.xml s3://bucket1 --signature-v2
s3://bucket1/: Lifecycle Policy updated
version

ceph version 11.2.0 (f223e27eeb35991352ebc1f67423d4ebc252adb7)
s3cmd version 1.6.1
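
To avoid forgetting the Prefix tag, the lifecycle file can be generated instead of written by hand. This is a minimal sketch, assuming a single expiration rule like the working example above; the function name make_lifecycle_xml and its arguments are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints a lifecycle configuration with one
# expiration rule. The Prefix element is always emitted, even when empty.
make_lifecycle_xml() {
	local id="$1" days="$2" prefix="${3:-}"
	cat <<EOF
<LifecycleConfiguration>
    <Rule>
        <ID>${id}</ID>
        <Prefix>${prefix}</Prefix>
        <Status>Enabled</Status>
        <Expiration>
             <Days>${days}</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
EOF
}

make_lifecycle_xml "ExampleRule" 1
```

Redirect the output to lc.xml and upload it as before with s3cmd setlifecycle lc.xml s3://bucket1 --signature-v2.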

ceph metasearch – elasticsearch backend

Fetch zonegroup configuration (json struct)
# radosgw-admin zonegroup get > /tmp/zonegroup.json

change the tier_type to elasticsearch

Import the configuration
# radosgw-admin zonegroup set --infile /tmp/zonegroup.json
Fetch zone configuration (json struct)
# radosgw-admin zone get > /tmp/zone.json

Add the endpoint key, with the elasticsearch url as its value, to the tier_config section

    "tier_config": [
        {
            "key": "endpoint",
            "val": "http:\/\/192.168.122.71:9200"
        }
    ],
Import the configuration
# radosgw-admin zone set --infile /tmp/zone.json

OR

# radosgw-admin zone modify --rgw-zonegroup={zonegroup-name} --rgw-zone={zone-name} --tier-config=endpoint={url}
Update & Commit
# radosgw-admin period update --commit

to be continued… part2

Fixing ceph partition uuid or OSD data dir is not mounted

OSD_UUID

4fbd7e29-9d25-41b8-afd0-062c0ceff05d

JOURNAL_UUID

45b0969e-9b03-4f30-b4c6-b4b80ceff106

To fix the partition uuid

sgdisk --info=##partnr## -t ##partnr##:##part-uuid## /dev/##disk##

e.g.
sgdisk --info=1 -t 1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/sda

Ref: /lib/udev/rules.d/95-ceph-osd.rules
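
To avoid mixing up the two UUIDs, the command can be built from a role name. This is only a sketch: the function name sgdisk_fix_cmd and the role names osd and journal are hypothetical, and it prints the command for review instead of running it.

```shell
#!/bin/bash
# UUIDs from the text above.
OSD_UUID="4fbd7e29-9d25-41b8-afd0-062c0ceff05d"
JOURNAL_UUID="45b0969e-9b03-4f30-b4c6-b4b80ceff106"

# Hypothetical helper: prints (does not run) the sgdisk call for a role.
sgdisk_fix_cmd() {
	local role="$1" partnr="$2" disk="$3" guid
	case "$role" in
		osd)     guid="$OSD_UUID" ;;
		journal) guid="$JOURNAL_UUID" ;;
		*)       echo "unknown role: $role" >&2; return 1 ;;
	esac
	echo "sgdisk --info=${partnr} -t ${partnr}:${guid} /dev/${disk}"
}

sgdisk_fix_cmd osd 1 sda
sgdisk_fix_cmd journal 2 sda
```

Check the printed line against the partition layout before running it by hand.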

[notepad] ceph journal size/ssd speed

ceph journal size (doc)

osd journal size = {2 * (expected throughput * filestore max sync interval)}

The default for filestore max sync interval is 5 seconds, so for a 10Gbit network (about 1280 MB/s) the “perfect” size would be

osd journal size = { 2 * ( 1280 * 5 ) } = 12800 MB = 12.5 GB
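
The same arithmetic as a small shell function, for trying other throughput values. A sketch only: the function name journal_size_mb is hypothetical, throughput is assumed to be given in MB/s and the interval in seconds, and the sync interval defaults to 5 as stated above.

```shell
#!/bin/bash
# journal size (MB) = 2 * throughput (MB/s) * filestore max sync interval (s)
journal_size_mb() {
	local throughput_mb_s="$1" sync_interval_s="${2:-5}"
	echo $(( 2 * throughput_mb_s * sync_interval_s ))
}

journal_size_mb 1280 5   # prints 12800 (MB), i.e. 12.5 GB at 1024 MB per GB
journal_size_mb 110      # one ~110 MB/s disk, default interval: prints 1100
```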

ceph ssd speed (journal)

The optimum would be the sum of all disks' sequential write speeds: 11 disks at ~110 MB/s each = ~1210 MB/s. An Intel P3520 might fit.

How many journals per ssd?

Oh, that's easy.

Journals = (ssd seq write speed) / (hdd seq write speed)

Journals = 1350 / 115 = ~11

(For the Intel P3520 with 11 hdds)
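
As a quick check, the division can be done with shell integer arithmetic, which rounds down. The function name journals_per_ssd is hypothetical; both speeds are assumed to be in MB/s.

```shell
#!/bin/bash
# journals = ssd seq write speed / hdd seq write speed (integer division)
journals_per_ssd() {
	local ssd_mb_s="$1" hdd_mb_s="$2"
	echo $(( ssd_mb_s / hdd_mb_s ))
}

journals_per_ssd 1350 115   # prints 11
```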