ceph – sort osds by utilisation

  1. Version 1 – keep it simple

    $ ceph osd df  | awk '{ print "osd."$1, "size: "$5, "usage: " $8 }' | sort -nk5
    

    OSDs can be listed twice, depending on the crushmap.
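
    If the duplicates get in the way, one workaround (assuming the duplicated lines are byte-identical) is to drop them before the numeric sort:

    $ ceph osd df | awk '{ print "osd."$1, "size: "$5, "usage: " $8 }' | sort -u | sort -nk5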

  2. Version 2 – json + python

    $ ceph osd df tree -f json | python sort_hdd_osds.py
    osd.28  utilization: 15.278888
    osd.15  utilization: 19.700484
    osd.58  utilization: 25.052757
    osd.31  utilization: 28.781335
    osd.22  utilization: 31.525527
    osd.2   utilization: 32.456151
    osd.47  utilization: 32.496669
    osd.39  utilization: 32.598765
    osd.17  utilization: 34.17247
    osd.40  utilization: 34.375297
    osd.56  utilization: 35.102418
    osd.48  utilization: 36.400253
    osd.50  utilization: 36.608321
    osd.52  utilization: 36.628858
    osd.38  utilization: 36.929235
    osd.13  utilization: 37.222498
    osd.30  utilization: 40.405145
    osd.59  utilization: 40.708111
    osd.62  utilization: 40.813985
    osd.43  utilization: 41.488432
    osd.53  utilization: 42.457611
    osd.49  utilization: 42.834021
    osd.23  utilization: 42.907104
    osd.18  utilization: 48.978743
    

    Line 22 of sort_hdd_osds.py specifies the bucket type that is used (not the device class!).
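
    The script itself isn't reproduced here. A minimal sketch of what a sort_hdd_osds.py-style filter could look like, assuming the usual "ceph osd df tree -f json" layout (a "nodes" list with name, type, children and utilization per entry) and a hard-coded bucket type:

    #!/usr/bin/env python
    # Sketch only - not the original sort_hdd_osds.py.
    # Reads "ceph osd df tree -f json" from stdin and prints the OSDs below
    # buckets of BUCKET_TYPE, sorted by utilization.
    import json
    import sys

    BUCKET_TYPE = "host"  # assumption: adjust the bucket type to your crushmap

    tree = json.load(sys.stdin)
    nodes = tree.get("nodes", [])

    # OSD ids (>= 0) that are children of buckets of the wanted type
    wanted = set()
    for node in nodes:
        if node.get("type") == BUCKET_TYPE:
            wanted.update(c for c in node.get("children", []) if c >= 0)

    osds = [n for n in nodes if n.get("type") == "osd" and n.get("id") in wanted]
    for osd in sorted(osds, key=lambda n: n["utilization"]):
        print("{0}\tutilization: {1}".format(osd["name"], osd["utilization"]))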

ceph – wrong osd id with lvm+filestore

Not sure why… but I’ve found a strange ceph-volume behavior with LVM and filestore.

ceph-volume lvm list shows the wrong OSD id while the affected OSD is online with another id.

$ mount | grep ceph-2
/dev/mapper/vg00-datalv1 on /var/lib/ceph/osd/ceph-2 type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
$ cat /var/lib/ceph/osd/ceph-2/whoami 
2
$ sudo ceph osd metadata osd.2 | egrep "id|objectstore"
    "id": 2,
    "osd_objectstore": "filestore",
$ sudo ceph-volume lvm list
[...]
====== osd.8 =======

  [data]    /dev/vg00/datalv1

      type                      data
      journal uuid              XqM6CP-embw-gIfs-UN2Q-gRDR-TVWP-y1q5Te
      osd id                    8
      cluster fsid              ed62dbfb-f0f7-4b13-ace0-4ccea0c4a6bf
      cluster name              ceph
      osd fsid                  38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5
      encrypted                 0
      data uuid                 W3h12f-xg3y-ij1Z-F70h-yx2n-SyD9-ioNEC7
      cephx lockbox secret
      crush device class        None
      data device               /dev/vg00/datalv1
      vdo                       0
      journal device            /dev/vg00/journallv1

  [journal]    /dev/vg00/journallv1

      type                      journal
      journal uuid              XqM6CP-embw-gIfs-UN2Q-gRDR-TVWP-y1q5Te
      osd id                    8
      cluster fsid              ed62dbfb-f0f7-4b13-ace0-4ccea0c4a6bf
      cluster name              ceph
      osd fsid                  38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5
      encrypted                 0
      data uuid                 W3h12f-xg3y-ij1Z-F70h-yx2n-SyD9-ioNEC7
      cephx lockbox secret
      crush device class        None
      data device               /dev/vg00/datalv1
      vdo                       0
      journal device            /dev/vg00/journallv1

And if you try to start the OSD via ceph-volume lvm trigger with the “wrong” ID 8, it will…

$ sudo ceph-volume lvm trigger 8-38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5
Running command: mount -t xfs -o rw,noatime,inode64 /dev/vg00/datalv1 /var/lib/ceph/osd/ceph-8
Running command: ln -snf /dev/vg00/journallv1 /var/lib/ceph/osd/ceph-8/journal
Running command: chown -R ceph:ceph /dev/dm-2
Running command: systemctl enable ceph-volume@lvm-8-38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5
Running command: systemctl start ceph-osd@8
--> ceph-volume lvm activate successful for osd ID: 8

$ sudo cat /var/log/ceph/ceph-osd.8.log
2018-07-04 19:28:34.754576 7f346e67fd80  0 set uid:gid to 167:167 (ceph:ceph)
2018-07-04 19:28:34.754598 7f346e67fd80  0 ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable), process (unknown), pid 3755
2018-07-04 19:28:34.754872 7f346e67fd80 -1 OSD id 2 != my id 8

FAIL! Same with the correct ID 2…

[vagrant@ceph-osd2 ~]$ sudo ceph-volume lvm trigger 2-38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5
-->  RuntimeError: could not find osd.2 with fsid 38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5

To fix that problem we need to adjust the ceph.osd_id tag on the LVM devices.

$ sudo lvs -o lv_tags vg00/datalv1
  LV Tags                                                                                                                                                                                                                                                                                                                                                                                                                                     
  ceph.cephx_lockbox_secret=,ceph.cluster_fsid=ed62dbfb-f0f7-4b13-ace0-4ccea0c4a6bf,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.data_device=/dev/vg00/datalv1,ceph.data_uuid=W3h12f-xg3y-ij1Z-F70h-yx2n-SyD9-ioNEC7,ceph.encrypted=0,ceph.journal_device=/dev/vg00/journallv1,ceph.journal_uuid=XqM6CP-embw-gIfs-UN2Q-gRDR-TVWP-y1q5Te,ceph.osd_fsid=38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5,ceph.osd_id=8,ceph.type=data,ceph.vdo=0
$ sudo lvs -o lv_tags vg00/journallv1
  LV Tags                                                                                                                                                                                                                                                                                                                                                                                                                                        
  ceph.cephx_lockbox_secret=,ceph.cluster_fsid=ed62dbfb-f0f7-4b13-ace0-4ccea0c4a6bf,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.data_device=/dev/vg00/datalv1,ceph.data_uuid=W3h12f-xg3y-ij1Z-F70h-yx2n-SyD9-ioNEC7,ceph.encrypted=0,ceph.journal_device=/dev/vg00/journallv1,ceph.journal_uuid=XqM6CP-embw-gIfs-UN2Q-gRDR-TVWP-y1q5Te,ceph.osd_fsid=38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5,ceph.osd_id=8,ceph.type=journal,ceph.vdo=0
  1. Remove the old tag

    lvchange --deltag ceph.osd_id=8 vg00/datalv1
    lvchange --deltag ceph.osd_id=8 vg00/journallv1
  2. Add the correct tag

    lvchange --addtag ceph.osd_id=2 vg00/datalv1
    lvchange --addtag ceph.osd_id=2 vg00/journallv1
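
The same two steps as one combined sketch, taking the correct id from the mounted OSD's whoami file (VG/LV names and mountpoint are the ones from this example):

#!/bin/bash
# Sketch: re-tag both LVs with the id from the mounted OSD's whoami file.
WRONG_ID=8
CORRECT_ID=$(cat /var/lib/ceph/osd/ceph-2/whoami)

for lv in vg00/datalv1 vg00/journallv1; do
    sudo lvchange --deltag "ceph.osd_id=${WRONG_ID}" "$lv"
    sudo lvchange --addtag "ceph.osd_id=${CORRECT_ID}" "$lv"
done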

Et voilà:

$ sudo ceph-volume lvm trigger 2-38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5
Running command: mount -t xfs -o rw,noatime,inode64 /dev/vg00/datalv1 /var/lib/ceph/osd/ceph-2
Running command: ln -snf /dev/vg00/journallv1 /var/lib/ceph/osd/ceph-2/journal
Running command: chown -R ceph:ceph /dev/dm-2
Running command: systemctl enable ceph-volume@lvm-2-38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5
 stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-2-38e7bfb3-ad57-4979-b8a9-3f875e6cb6f5.service to /usr/lib/systemd/system/ceph-volume@.service.
Running command: systemctl start ceph-osd@2
--> ceph-volume lvm activate successful for osd ID: 2
$ sudo cat /var/log/ceph/ceph-osd.2.log
2018-07-04 19:40:04.075588 7fa9cbf6bd80  0 set uid:gid to 167:167 (ceph:ceph)                                                                                                                                                                                                                                                
2018-07-04 19:40:04.075608 7fa9cbf6bd80  0 ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable), process (unknown), pid 4165                                                                                                                                                                     
2018-07-04 19:40:04.080821 7fa9cbf6bd80  0 pidfile_write: ignore empty --pid-file                       
2018-07-04 19:40:04.109636 7fa9cbf6bd80  0 load: jerasure load: lrc load: isa                                                                                                                                                                                                                                                
2018-07-04 19:40:04.110273 7fa9cbf6bd80  0 filestore(/var/lib/ceph/osd/ceph-2) backend xfs (magic 0x58465342)                                                                                                                                                                                       
2018-07-04 19:40:04.121305 7fa9cbf6bd80  0 filestore(/var/lib/ceph/osd/ceph-2) start omap initiation
[...]    

ceph-ansible: minimal containerized deployment (docker)

tested with v3.0.26

group_vars/all.yml

---
monitor_interface: eth1
radosgw_interface: eth1
public_network: 10.20.30.0/24
cluster_network: 192.168.121.0/24
ceph_conf_overrides:
    osd:
        osd scrub during recovery: false
ceph_docker_image: "ceph/daemon"
ceph_docker_image_tag: latest
ceph_docker_registry: 10.20.30.1:5000
containerized_deployment: true

group_vars/osds.yml

---
crush_location: true
osd_crush_location: "\"root={{ ceph_crush_root }} rack={{ ceph_crush_rack }} host={{ ansible_hostname }}\""
osd_objectstore: bluestore
osd_scenario: non-collocated
devices:
- /dev/sdb
- /dev/sdc
- /dev/sdd
- /dev/sde
- /dev/sdf
- /dev/sdg
- /dev/sdh
dedicated_devices:
- /dev/nvme0n1
- /dev/nvme0n1
- /dev/nvme0n1
- /dev/nvme0n1
- /dev/nvme0n1
- /dev/nvme0n1
- /dev/nvme0n1
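
ceph_crush_root and ceph_crush_rack are not ceph-ansible defaults; they are expected to come from the inventory or host_vars. A minimal sketch with hypothetical values:

# host_vars/ceph-osd1.yml - hypothetical values
ceph_crush_root: default
ceph_crush_rack: rack1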

group_vars/mons.yml (optional)

---
openstack_config: true
openstack_glance_pool:
  name: images
  pg_num: "{{ osd_pool_default_pg_num }}"
  rule_name: ""
openstack_pools:
  - "{{ openstack_glance_pool }}"

SUSE Cloud – missing cinder key on computes – part2

I’ve found the root cause for the missing cinder key on the computes.

chef-client output – without any files:

[2017-11-30T09:37:33+01:00] INFO: Processing package[ceph-common] action install (nova::ceph line 50)
[2017-11-30T09:37:33+01:00] INFO: Ceph configuration file is missing; skipping the ceph setup for backend ceph-hdd
[2017-11-30T09:37:33+01:00] INFO: Ceph configuration file is missing; skipping the ceph setup for backend ceph-ssd

chef-client output – only with ceph.conf:

[2017-11-30T09:40:00+01:00] INFO: Processing package[ceph-common] action install (nova::ceph line 50)
[2017-11-30T09:40:00+01:00] INFO: Ceph user keyring wasn't provided for backend ceph-hdd
[2017-11-30T09:40:00+01:00] INFO: Ceph user keyring wasn't provided for backend ceph-ssd

Still not the right secret. The correct usage name should be “ceph crowbar-#uuid# secret”.

root@d98-f2-b3-9e-d6-30:~ # virsh secret-list
 UUID                                  Usage
--------------------------------------------------------------------------------
 5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6  ceph client.cinder secret

chef-client output – now with the key and the ceph.conf:

[2017-11-30T09:51:16+01:00] INFO: Processing package[ceph-common] action install (nova::ceph line 50)
[2017-11-30T09:51:16+01:00] WARN: Cloning resource attributes for ruby_block[save nova key as libvirt secret] from prior resource (CHEF-3694)
[2017-11-30T09:51:16+01:00] WARN: Previous ruby_block[save nova key as libvirt secret]: /var/chef/cache/cookbooks/nova/recipes/ceph.rb:94:in `block in from_file'
[2017-11-30T09:51:16+01:00] WARN: Current  ruby_block[save nova key as libvirt secret]: /var/chef/cache/cookbooks/nova/recipes/ceph.rb:94:in `block in from_file'

Yeah. Finally!

root@d98-f2-b3-9e-d6-30:~ # virsh secret-list
 UUID                                  Usage
--------------------------------------------------------------------------------
 5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6  ceph crowbar-5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6 secret
 7003682d-80fe-4258-b2bb-e6c1b628aa5e  ceph crowbar-7003682d-80fe-4258-b2bb-e6c1b628aa5e secret
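
For reference, a libvirt secret following that naming scheme would be defined roughly like this (a sketch built from the first UUID above, not taken from the barclamp):

<secret ephemeral='no' private='no'>
  <uuid>5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6</uuid>
  <usage type='ceph'>
    <name>crowbar-5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6 secret</name>
  </usage>
</secret>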

ovirt/rhev/rhv – custom branding (web ui)

Step 1 – create a custom folder in /etc/ovirt-engine/branding

[root@rev1 branding]# ll
total 0
lrwxrwxrwx. 1 [..] 00-ovirt.brand -> /usr/share/ovirt-engine/branding/ovirt.brand
lrwxrwxrwx. 1 [..] 50-rhev-2.brand -> /usr/share/ovirt-engine/branding/rhev-2.brand
drwxr-xr-x. 3 [..] 99-custom.brand

Step 2 – copy your layout and create a ‘branding.properties’

[root@rev1 branding]# ll 99-custom.brand/
total 8
-rw-r--r--. 1 [..] branding.properties
-rw-r--r--. 1 [..] common.css
drwxr-xr-x. 2 [..] images

Step 3 – branding.properties – don’t remove the version parameter

[root@rev1 branding]# cat 99-custom.brand/branding.properties 
#style sheets.
userportal_css=common.css
webadmin_css=common.css
welcome_css=common.css

version=2

Step 4 – my example for a common.css – we only want to replace the logo!

[root@rev1 branding]# cat 99-custom.brand/common.css 
/* LoginSectionView.ui.xml:
   app logo, positioned in the top right of login screen */
.obrand_loginPageLogoImage {
    background-image: url(images/logo.png);
    width: 137px;
    height: 44px;
    border: 0px;
    display: block;
}

Step 5 – restart ovirt-engine and check the page source

# systemctl restart ovirt-engine
[...]
<link rel="stylesheet" type="text/css" href="/ovirt-engine/theme/00-ovirt.brand/welcome_style.css">
[...]
<link rel="stylesheet" type="text/css" href="/ovirt-engine/theme/99-custom.brand/common.css">
[...]

SUSE Cloud – missing cinder key on computes

2017-11-01 14:30:53.970 27835 ERROR nova.virt.libvirt.driver [instance: c5618826-98cb-4fd6-9d6f-b8899bd320b7] libvirtError: Secret not found: no secret with matching uuid '5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6'
2017-11-01 14:30:53.970 27835 ERROR nova.virt.libvirt.driver [instance: c5618826-98cb-4fd6-9d6f-b8899bd320b7]
2017-11-01 14:30:53.971 27835 ERROR nova.virt.block_device [req-9f046c95-fecf-46e5-874d-43b42da1e63f 62169e96ed4b485aa2dfb2ca3235305c 05f20019f1c94952937a7f34087f5471 - - -] [instance: c5618826-98cb-4fd6-9d6f-b8899bd320b7] Driver failed to attach volume 9f33b42f-79ba-472f-8e10-9525f186cde1 at /dev/vdb

Unless you find a key like the following on the compute (something with crowbar-$ID),

# virsh secret-list 
 UUID                                  Usage
--------------------------------------------------------------------------------
 5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6  ceph crowbar-5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6 secret

you have to fix it on your own:

#!/bin/bash

ID="5b7c1b36-9093-4a13-b14d-da8b8cbdd8a6"
# get cinder key from ceph cluster - ceph auth get-key client.cinder
CINDERKEY="AQA4cw1aa2tAAhAAxYl2l/lCaer3squRBdXBYg=="
FILE="<secret ephemeral='no' private='no'><uuid>$ID</uuid><usage type='ceph'><name>client.cinder secret</name></usage></secret>"
FILENAME="/tmp/secret.xml"

for host in 01 02 03 04 05; do
	dest="compute${host}"
	echo "Verify host $dest:"
	if ! ssh $dest virsh secret-get-value $ID; then
		echo "Create secret for cinder user."
		ssh $dest "echo \"$FILE\" > $FILENAME"
		ssh $dest virsh secret-define --file $FILENAME
		ssh $dest virsh secret-set-value --secret $ID --base64 $CINDERKEY
	fi
	echo "ok!"	
done

SUSE Openstack Cloud – debugging sleshammer

To get a login shell during discovery, before the NFS share is mounted:

Add the DISCOVERY_ROOT_PASSWORD parameter

root@admin:~ # crowbarctl proposal edit provisioner default
{
  "id": "provisioner-default",
  "description": "Created on Thu, 09 Nov 2017 15:43:20 +0100",
  "attributes": {
    "provisioner": {
[...]
      "discovery": {
        "append": "DISCOVERY_ROOT_PASSWORD=replace-with-your-password"
      }
[...]
    }
}
root@admin:~ # crowbarctl proposal commit provisioner default