Friday, September 20, 2013

Logical Volume Manager

LVM (Logical Volume Manager):
LVM manages disk storage by grouping physical disks into volume groups and dividing them into logical volumes, which gives a structured view of the storage.

/var/adm/ras/lvmcfg.log        lvm log file shows what lvm commands were used (alog -ot lvmcfg)
alog -ot lvmt                  shows lvm commands and libs

The LVM consists of:
    -high level commands: can be used by users, e.g.: mklv (this can call an intermediate level command)
    -intermediate level commands: these are used by high-level commands, e.g. lcreatelv (users should not use these)
    -LVM subroutine interface library: it contains routines used by commands, e.g. lvm_createlv
    -Logical Volume Device Driver (LVDD): manages and processes all I/O; it is called by jfs or lvm library routines
    -Disk Device Driver: It is called by LVDD
    -Adapter Device Driver: it provides an interface to the physical disk

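The execution of a high-level command passes through the layers of LVM in this order: high-level command (e.g. mklv) -> intermediate command (lcreatelv) -> library routine (lvm_createlv) -> LVDD -> disk device driver -> adapter device driver.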


LOGICAL VOLUME

After you create a volume group, you can create logical volumes within that volume group. Logical partitions and logical volumes make up the logical view. Logical partitions map to and are identical in size to the physical partitions. A physical partition is the smallest unit of allocation of disk where the data is actually stored. A logical volume is a group of one or more logical partitions that can span multiple physical volumes. All the physical volumes it spans must be in the same volume group.

A logical volume consists of a sequence of one or more logical partitions. Each logical partition has at least one and at most three corresponding physical partitions, which can be located on different physical volumes.

When you first define a logical volume, its state (LV STATE) is closed. It becomes open when, for example, a file system has been created in the logical volume and mounted.
You can also create a logical volume and put no file system on it. This is known as a raw logical volume. Databases frequently use raw devices.

Logical Volume types:
    - log logical volume: used by jfs/jfs2
    - dump logical volume: used by system dump, to copy selected areas of kernel data when an unexpected system halt occurs
    - boot logical volume: contains the initial information required to start the system
    - paging logical volume: used by the virtual memory manager to swap out pages of memory

Users and applications will use these lvs:
    - raw logical volumes: these are controlled by the application (it will not use jfs/jfs2)
    - journaled filesystems: logical volumes that hold a jfs/jfs2 file system


Striped logical volumes:
Striping is a technique for spreading the data in a logical volume across several physical volumes in such a way that the I/O capacity of the physical volumes can be used in parallel to access the data.


LVCB (Logical Volume Control Block)
The LVCB occupies the first 512 bytes of each logical volume in normal VGs (traditionally this was the file system boot block). In big VGs it was moved partially into the VGDA, and in scalable VGs completely. The LVCB stores the attributes of the LV. JFS does not access this area.
# getlvcb -AT <lvname>                                <--shows the LVCB of the lv

--------------------

LOGICAL VOLUME:     hd2                    VOLUME GROUP:   rootvg
LV IDENTIFIER:      0051f2ba00004c00000000f91d51e08b.5 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               jfs                    WRITE VERIFY:   off
MAX LPs:            512                    PP SIZE:        32 megabyte(s)
COPIES:             2                      SCHED POLICY:   parallel
LPs:                73                     PPs:            146
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       center                 UPPER BOUND:    32
MOUNT POINT:        /usr                   LABEL:          /usr
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
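
The figures in this listing are tied together by simple arithmetic: PPs = LPs x COPIES, and the usable size is LPs x PP SIZE. A minimal sketch of that calculation, assuming the values are typed in by hand (the function name lv_sizes is made up for illustration, it is not an AIX command):

```shell
# lv_sizes LPS COPIES PP_SIZE_MB
# Prints the PP count, the usable size and the raw disk space consumed.
# (hypothetical helper, only does arithmetic on the values passed in)
lv_sizes() {
    lps=$1
    copies=$2
    ppsize=$3
    echo "PPs=$((lps * copies)) usable_MB=$((lps * ppsize)) raw_MB=$((lps * copies * ppsize))"
}

# Values from the hd2 listing above: 73 LPs, 2 copies, 32 MB partitions
lv_sizes 73 2 32
```

With the hd2 values this gives 146 PPs, which matches the PPs field of the listing.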


inter-policy    inter-physical volume allocation policy, can be minimum or maximum
                minimum: allocate pp's on as few pv's as possible (the data is not spread over all pv's if it can be avoided)
                maximum: spread the physical partitions of this logical volume over as many physical volumes as possible

Example with 2 physical volumes: one contains partition 1 and a copy of partition 2, the other contains partition 2 and a copy of partition 1. This is the result of the Maximum Inter-Disk Policy (Range=maximum) with a Single Logical Volume Copy per Disk (Strict=y).


each lp copy on separate pv    The strictness value: the current state of allocation, which is strict, nonstrict, or superstrict. A strict allocation means that no two copies of a logical partition are allocated on the same physical volume. If the allocation does not follow the strict criteria, it is called nonstrict: copies of a logical partition can share the same physical volume. A superstrict allocation means that no partition from one mirror copy may reside on the same disk as a partition from another mirror copy (e.g. mirror 2 and mirror 3 cannot be on the same disk).

(So inter-policy and strictness together determine how many disks are used: if the first copies are spread over the maximum number of disks, mirroring them needs another set of disks; if they are spread over the minimum number of disks, mirroring needs fewer disks.)


intra-policy    intra-physical volume allocation policy, it specifies what strategy should be used for choosing pp's on a pv.
                it can be: edge (outer edge), middle (outer middle), center, inner middle, inner edge


If you specify a region but it is full, further partitions are allocated as near to that region as possible.
The more I/O a logical volume handles, the closer to the outer edge its pp's should be allocated.

mirror write consistency If turned on, LVM keeps additional information to allow recovery of inconsistent mirrors.
                  Mirror write consistency recovery should be performed for most mirrored logical volumes.
                  MWC is necessary to mirror lvs with parallel scheduling policies.

sched policy      how reads and writes to mirrored logical volumes are handled
                  parallel (default): read from the least busy disk, write to all copies concurrently (at the same time)
                  sequential: read from the primary copy only (if it is not available, from the next copy); writes are done sequentially (one copy after another)
                  (one book suggests sequential because it works with MWC)

Write verify      If turned on, every write is verified with a follow-up read. This negatively impacts performance, but can be useful.

BB policy         Bad block relocation policy. (bad blocks are relocatable or not)

Relocatable       Indicates whether the partitions can be relocated if a reorganization of partition allocation takes place.

Upper Bound       the maximum number of physical volumes a logical volume can use for allocation

------------------

# lslv -l pdwhdatlv

PV                COPIES        IN BAND       DISTRIBUTION
hdiskpower5       125:000:000   3%            000:004:000:076:045

Copies            shows the number of pps of each copy on the disk (separated by :); here the first copy has 125 pps on the disk and no mirror copies are on it

In Band           the percentage of pps on the disk which were allocated within the region specified by Intra-physical allocation policy

Distribution      how many pps are allocated in: outer edge, outer middle, center, inner middle, and inner edge (125=4+76+45)
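
The relation between the COPIES and DISTRIBUTION columns can be checked with a few lines of awk. This is only a sketch: check_distribution is a made-up helper name, and it assumes the two colon-separated columns are passed in exactly as lslv -l prints them.

```shell
# check_distribution COPIES DISTRIBUTION
# COPIES:       pps of copy 1, 2 and 3 on the disk (e.g. 125:000:000)
# DISTRIBUTION: pps in outer edge, outer middle, center, inner middle, inner edge
# Prints both totals so they can be compared.
# (hypothetical helper, works on the text only and calls no LVM command)
check_distribution() {
    echo "$1 $2" | awk '{
        split($1, c, ":")
        split($2, d, ":")
        total = c[1] + c[2] + c[3]
        sum = d[1] + d[2] + d[3] + d[4] + d[5]
        printf "copies_total=%d distribution_total=%d\n", total, sum
    }'
}

# Values from the hdiskpower5 line above
check_distribution 125:000:000 000:004:000:076:045
```

Both totals should be equal, since every pp counted under COPIES lies in exactly one of the five regions.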

------------------


lslv lvname       displays information about the logical volume
lslv -m lvname    displays the logical partitions (LP) and their corresponding physical partitions (PP)
lslv -l lvname    displays on which physical volumes the lv resides
lslv -p <hdisk>   displays the logical volume allocation map for the disk (shows used, free, stale for each physical partition)
lslv -p <hdisk> <lv> displays the same as above, but the given lv's partitions are shown by their numbers

    Open          Indicates active if LV contains a file system   
    Closed        Indicates inactive if LV contains a file system   
    Syncd         Indicates that all copies are identical   
    Stale         Indicates that copies are not identical   


mklv -y newlv1 datavg 1    creates a logical volume (mklv -y'testlv' -t'jfs' rootvg 100 <--creates a jfs type lv with 100 lp)
    -y newlv1     name of the lv
    datavg        the vg in which the lv will reside
    1             number of logical partitions to allocate to the lv

mklv -t jfs2log -y <lvname> <vgname> 1 <pvname> creates a jfs2log lv (after creation format it: logform -V jfs2 /dev/<lvname>)

rmlv              removes a logical volume
rmlv -f loglv     removes without confirmation

mklvcopy bblv 2 hdisk2    makes a 2nd copy (1LP=2PP) of bblv on hdisk2 (synchronization is needed afterwards, e.g. syncvg -l bblv or syncvg -p hdisk2)
rmlvcopy bblv 1 hdisk3    leaves only 1 copy (1LP=1PP), removing the copies that are on hdisk3

getlvcb           display the LVCB (Logical Volume Control Block) of a logical volume
extendlv          increases the size of a logical volume
cplv              copies a logical volume
chlv              changes the characteristics of a logical volume

migratelp testlv/1/2 hdisk5/123 migrates the second copy of testlv's 1st lp to pp 123 on hdisk5
                 (the output of lspv -M hdiskx can be used: it shows lvname:lpnumber:copy, the sequence needed here)
                 (if the lv is not mirrored, this is easier: migratelp testlv/1 hdisk3)
                 (if the lv is mirrored and the short form above is used, the 1st copy is migrated: testlv/1/1...)

migratelp in a for loop:
for i in $(lslv -m p1db2lv | grep hdiskpower11 | tail -50 | cut -c 2-4); do migratelp p1db2lv/$i hdiskpower3; done

lresynclv        resynchronizes a logical volume (relevant for mirrored lvs; it takes the LV identifier, not the lv name)

------------------

Creating a new log logical volume:


1. mklv -t jfs2log -y lvname vgname 1 pvname        <-- creates the log lv
2. logform -V jfs2 /dev/lvname
3. chfs -a log=/dev/lvname /fsname                  <--changes the log lv (it can be checked in /etc/filesystems)

------------------

Resynchronizing a logical volume:

1. root@aix16: / # lslv hd6 | grep IDENTIFIER
LV IDENTIFIER:      00c2a5b400004c0000000128f907d534.2

2. lresynclv -l 00c2a5b400004c0000000128f907d534.2
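
The two steps can be joined so the identifier does not have to be copied by hand. A small sketch, assuming the lslv output has the "LV IDENTIFIER:" layout shown above (lv_identifier is a made-up helper name):

```shell
# lv_identifier: reads "lslv <lvname>" output on stdin and prints the
# LV identifier, i.e. the third whitespace-separated field of that line.
# (hypothetical helper, parses text only)
lv_identifier() {
    awk '/^LV IDENTIFIER:/ { print $3; exit }'
}

# Offline demonstration with the line from step 1
echo "LV IDENTIFIER:      00c2a5b400004c0000000128f907d534.2" | lv_identifier

# On a live AIX system the two steps would then become (not run here):
#   lresynclv -l "$(lslv hd6 | lv_identifier)"
```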

------------------

Striped lv extending problems:

A striped lv can be extended only in multiples of the stripe width (if the width is 2, the number of added lps must be 2, 4, 6...).
If the lv can't be extended, the upper bound can be the cause:

lslv P02ctmbackuplv | grep UPPER
UPPER BOUND:    2

It means that the lv can reside on only 2 disks; if those 2 disks have no more free space, it can't be extended to other disks.
The upper bound should be changed: chlv -u 4 P02ctmbackuplv


After this, the extension should be possible.
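
The stripe width rule is easy to get wrong when the needed space is calculated by hand. A minimal sketch that rounds a wanted number of lps up to the next multiple of the stripe width (round_to_stripe_width is a made-up helper name, and the numbers are examples only):

```shell
# round_to_stripe_width LPS WIDTH
# Rounds LPS up to the next multiple of WIDTH using integer arithmetic.
# (hypothetical helper, does not call any LVM command)
round_to_stripe_width() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

round_to_stripe_width 5 2    # prints 6
round_to_stripe_width 8 4    # prints 8
```

The result is the number of lps that can be passed to extendlv for a striped lv of that width.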
------------------

Unable to find lv in the Device Configuration Database

1. synclvodm <vgname>         <-- synchronizes or rebuilds the LVCB, the device configuration database (ODM) and the VGDAs on the physical volumes
2. rmlv <lvname>              <-- removes the unwanted logical volume

------------------

Migrating PPs between disks:


checking the PPs of test1lv:
lslv -m test1lv
test1lv:/home/test1fs
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0001 hdisk6
0002  0002 hdisk6
0003  0003 hdisk6
...
0057  0057 hdisk6
0058  0058 hdisk6
0059  0059 hdisk6

the command: migratelp test1lv/59 hdisk7
(it will migrate LP #59 to hdisk7)

in a for loop:
for i in $(lslv -m shadowlv | grep hdisk1 | tail -10 | cut -c 2-4); do
migratelp shadowlv/${i} hdisk0
done
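
The grep and cut -c 2-4 combination in the loop works, but grep hdisk1 also matches hdisk10, hdisk11 and so on, and the cut depends on fixed column positions. A sketch of a stricter selection with awk, shown as a dry run that only prints the commands (lps_on_pv is a made-up helper name; the hdisk60 row in the sample was added for illustration and is not from a real system):

```shell
# lps_on_pv PV
# Reads "lslv -m <lv>" output on stdin and prints the LP numbers whose first
# copy (PV1 column) is exactly PV. The two header lines are skipped.
# (hypothetical helper, parses text only)
lps_on_pv() {
    awk -v pv="$1" 'NR > 2 && $3 == pv { print $1 + 0 }'
}

# Sample in lslv -m layout; the hdisk60 row is invented to show exact matching
sample='test1lv:/home/test1fs
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0001 hdisk6
0002  0002 hdisk6
0003  0003 hdisk60
0059  0059 hdisk6'

# Dry run: print the migratelp commands for the last 2 matching LPs
for i in $(echo "$sample" | lps_on_pv hdisk6 | tail -n 2); do
    echo migratelp test1lv/$i hdisk7
done
```

On a real system the sample would be replaced by lslv -m test1lv, and the echo removed once the printed commands look right.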


------------------

I once had a problem with an lv and its mirror copies:

root@bb_lpar: / # lsvg -l bbvg
bbvg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
0516-1147 : Warning - logical volume bblv may be partially mirrored.
bblv                jfs2       16      20      3    closed/syncd  /bb


root@bb_lpar: / # mirrorvg bbvg
0516-1509 mklvcopy: VGDA corruption: physical partition info for this LV is invalid.
0516-842 mklvcopy: Unable to make logical partition copies for
        logical volume.
0516-1199 mirrorvg: Failed to create logical partition copies
        for logical volume bblv.
0516-1200 mirrorvg: Failed to mirror the volume group.


root@bb_lpar: / # lslv -l bblv
0516-1939 : PV identifier not found in VGDA.


root@bb_lpar: / # rmlvcopy bblv 1 hdisk2
0516-1939 lquerypv: PV identifier not found in VGDA.
0516-304 getlvodm: Unable to find device id 0000000000000000 in the Device
        Configuration Database.
0516-848 rmlvcopy: Failure on physical volume 0000000000000000, it may be missing
        or removed.



The partially mirrored lps caused a big mess in the VGDA and LVM, so the solution was to remove these lps with a low-level command: lreducelv

1. checking the problematic lps:
root@bb_lpar: / # lslv -m bblv
bblv:/bb
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0008 hdisk2
0002  0009 hdisk2
0003  0010 hdisk2
0004  0011 hdisk2
0005  0012 hdisk2
0006  0013 hdisk2
0007  0014 hdisk2
0008  0015 hdisk2
0009  0008 hdisk3            0016 hdisk2
0010  0009 hdisk3            0017 hdisk2
0011  0010 hdisk3            0018 hdisk2
0012  0012 hdisk3            0019 hdisk2
0013  0001 hdisk2
0014  0002 hdisk2
0015  0003 hdisk2
0016  0004 hdisk2


2. creating a text file with these wrong lps which will be used by lreducelv:
1st column: PVID of the disk with wrong lps (lspv hdisk2: 00080e82dfab25bc)
2nd column: PP# of the wrong lps (lslv -m bblv: PP2 column)
3rd column: LP# of the wrong lps (lslv -m bblv: LP column)

root@bb_lpar: / # vi partial_mir.txt
00080e82dfab25bc 0016 0009
00080e82dfab25bc 0017 0010
00080e82dfab25bc 0018 0011
00080e82dfab25bc 0019 0012
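
With only 4 lps the file is quickly typed, but for longer lists it could be generated from the lslv -m output. A sketch under the assumption that the unwanted copies are the ones in the PP2/PV2 columns on one disk (partial_mirror_map is a made-up helper name; the sample is an excerpt of the listing from step 1):

```shell
# partial_mirror_map PVID PV
# Reads "lslv -m <lv>" output on stdin. For every LP whose second copy
# (PP2/PV2 columns) is on PV, prints "PVID PP# LP#", the three columns
# described above. The two header lines are skipped.
# (hypothetical helper, parses text only and changes nothing)
partial_mirror_map() {
    awk -v pvid="$1" -v pv="$2" 'NR > 2 && $5 == pv { print pvid, $4, $1 }'
}

# Excerpt of the lslv -m bblv output from step 1
sample='bblv:/bb
LP    PP1  PV1               PP2  PV2               PP3  PV3
0008  0015 hdisk2
0009  0008 hdisk3            0016 hdisk2
0010  0009 hdisk3            0017 hdisk2
0011  0010 hdisk3            0018 hdisk2
0012  0012 hdisk3            0019 hdisk2
0013  0001 hdisk2'

echo "$sample" | partial_mirror_map 00080e82dfab25bc hdisk2
```

The output should be compared with lslv -m by eye before it is saved to the file, because lreducelv works on exactly the partitions listed.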


3. removing the partial mirror copies:
lreducelv -l <LV ID> -s <NUMBER of LPs> <TEXT FILE>

LV ID: 00080e820000d900000001334c11e0de.1 (lslv bblv)
NUMBER of LPs: 4 (wc -l partial_mir.txt)
TEXT FILE: partial_mir.txt

root@bb_lpar: / # lreducelv -l 00080e820000d900000001334c11e0de.1 -s 4 partial_mir.txt

LVM now deallocates all pps of the partial mirror copy.


4. After this, lslv -m shows correct output, but the ODM, LVCB or VGDA could still show that we have 2 copies
root@bb_lpar: /tmp/bb # odmget -q name=bblv CuAt | grep -p copies

CuAt:
        name = "bblv"
        attribute = "copies"
        value = "2"
        type = "R"
        generic = "DU"

(This stanza appears only if there is mirroring; otherwise the odmget command produces no output.)


root@bb_lpar: /tmp/bb # getlvcb -AT bblv
         AIX LVCB
         intrapolicy = m
         copies = 1

(odmget shows we have 2 copies and getlvcb shows we have only 1 copy.)
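
The same comparison can be scripted. A sketch that extracts the copies value from both outputs, assuming they have the layout shown above (odm_copies and lvcb_copies are made-up helper names; the demonstration feeds them the lines from this page instead of calling odmget and getlvcb):

```shell
# odm_copies: reads "odmget -q name=<lv> CuAt" output on stdin and prints the
# value that follows the attribute = "copies" line (nothing if it is absent)
odm_copies() {
    awk '$1 == "attribute" && $3 == "\"copies\"" { found = 1 }
         found && $1 == "value" { gsub(/"/, "", $3); print $3; exit }'
}

# lvcb_copies: reads "getlvcb -AT <lv>" output on stdin, prints the copies value
lvcb_copies() {
    awk '$1 == "copies" && $2 == "=" { print $3; exit }'
}

# Offline demonstration with the outputs shown above
odm=$(printf 'CuAt:\n        name = "bblv"\n        attribute = "copies"\n        value = "2"\n' | odm_copies)
lvcb=$(printf '         AIX LVCB\n         intrapolicy = m\n         copies = 1\n' | lvcb_copies)

if [ "$odm" = "$lvcb" ]; then
    echo "consistent: $odm"
else
    echo "mismatch: ODM=$odm LVCB=$lvcb"
fi
```

On a live system the printf lines would be replaced by odmget -q name=bblv CuAt and getlvcb -AT bblv.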

Probably it is safer if we update both with the correct value:
putlvodm -c <COPYNUM> <LV ID>
putlvcb -c <COPYNUM> <LV NAME>

COPYNUM: 1
LV ID: 00080e820000d900000001334c11e0de.1 (lslv bblv)

root@bb_lpar: /tmp/bb # putlvodm -c 1 00080e820000d900000001334c11e0de.1
root@bb_lpar: /tmp/bb # putlvcb -c 1 bblv

source of this solution: http://archive.rootvg.net/cgi-bin/anyboard.cgi/aix?cmd=get&cG=73337333&zu=37333733&v=2&gV=0&p=

------------------
