Gitweb: http://git.fedorahosted.org/git/gfs2-utils.git?p=gfs2-utils.git;a=commitdiff...
Commit: 5ec8f18602297bf888ffae1957ef6023d74fae12
Parent: 0000000000000000000000000000000000000000
Author: Steven Whitehouse swhiteho@redhat.com
AuthorDate: 2010-10-27 10:18 +0000
Committer: Steven Whitehouse swhiteho@redhat.com
CommitterDate: 2010-10-27 10:18 +0000
annotated tag: 3.1.0 has been created
        at 5ec8f18602297bf888ffae1957ef6023d74fae12 (tag)
   tagging 97368b361627c1a89f7e241297541118018504c9 (commit)
Version 3.1.0
A. J. Lewis (23):
      o CFLAGS+= instead of CFLAGS= so you can add options from the cmdline
      o allow CFLAGS to be added to from the command line
      Out with the old, in with the new.
      o Remove a number of unused macros
      o Check for dups when looking at inodes for the first time
      o Started working on interactive UI
      Large batch of changes - mainly dealing with ExHash directory handling.
      Fixed number of times it's necessary to run fsck to completely clean fs
      o convert (f)printf to log_* in fs_dir.c
      o Count blocks used by inodes and update them if counted doesn't match ondisk
      Check that a device has been passed in (bz #149261)
      o adjust the gfs ondisk format number
      o Release buffers after use
      o Convert remaining (f)printf's to log_*
      o increment link count when dealing with bad '.' & '..' entries
      o Update to latest ondisk.h from the s.r.c. gfs kernel source
      o Fix the (d)inode_hash_insert() fxns (fixes bz #149706)
      o continue instead of breaking on errors in scan_inode_list()
      o Slight modification to fsck man page for new fsck
      o Fix for fenced portion of bz #155478
      Update fsck in HEAD of CVS with changes made to RHEL4 and STABLE
      o Make sure the link counts of directories are properly incremented
      Initial commit of fsck for GFS2
Abhijith Das (80):
      fix for bz169087 - split fill_super_block() into read_super_block and fill_super_block. calling block_mounters between calls to the two sb functions
      bz127042 fix: kill gnbd_monitor when all uncached gnbds have been removed
      fix for bz178812. ccsd init script and daemon print errors now
      cman init script fix for bz 159783. dlm module is modprobed immediately after cman module. Previously, dlm module was loaded after cman_tool join. fixed. Obeys LSB standards for return values
      bz 190200: cleaned up gfs2_tool to compile. Removed some features (gfs2 doesn't support them anymore) and commented some out for later implementation. Needs to be reviewed sometime in the future.
      bz 190200 : man page changes for gfs2_tool changes
      fix for bz 190392 + pjc's return 1 if 'status' fails. bz#187279
      bz191222 fix. When releasing a glock with GL_NOCACHE flag set, care was not taken to ensure that only one holder for the glock remained. This was corrupting the glock and preventing further access to the glock. FLOCKS use this GL_NOCACHE flag. See bugzilla for more information. This patch needs to go through a test cycle to ensure that it doesn't affect other code
      bz 191222 : removed debug printk
      Removed usage of IFLAG_INHERITDIRECTIO and IFLAG_INHERITJDATA flags, because they were removed from linux/iflags.h
      Edited Makefile to look more like other makefiles, dependent on objects rather than sources, etc.
      Removed reference to asm/page.h (for PAGE_SIZE) from util.c. Wasn't compiling on x86_64. Instead made PAGE_SIZE a #define
      Single init script to start up cluster: Covers loading of modules, starting ccsd, cman and fencing, and starting daemons. Replaces ccsd, cman and fenced init scripts
      Removed ccsd and fenced init scripts. Their functionality is replaced by the cman init script
      modprobing lock_dlm before starting gfs_controld, removed init.d make targets for ccs and fence
      gfs2 init script. Minor changes. Works just fine
      Initial commit of gfs2_jadd. Doesn't work fully. Needs to be tested with GFS2 filesystem.
      Removed reference to lock_gulm from the script. Works fine as is.
      gfs2 doesn't allow gfs2meta and gfs2 filesystems to run parallely. gfs2_jadd umounts gfs2 and mounts gfs2meta to do its thing. Removed test mode. little-endian to big-endian change on disk-hash. Continuing work on bz 195591.
      awk matching string for gfs and gfs2 was not right. Was causing the init scripts to go into a loop when both gfs and gfs2 fs were mounted
      fix for bz 203167 and bz 202984. stop_fence was commented out. Now we do stop_fence before doing a cman_tool leave.
      fix for bz 190204. gfs2_jadd uses the gfs2meta filesystem to add journals to an existing gfs2 fs
      memory violation
      Adding Josef's noquota mount option for GFS1 in RHEL5. Original bz 205285
      bz 211418. Modified gfs2_tool and gfs2_jadd to use the new inode flags in fs.h instead of deprecated iflags.h
      bz 190196. gfs2_quota. Doesn't use sysfs anymore. Uses the gfs2meta filesystem instead.
      don't fail if unmounting configfs fails
      fix for bz 225199 - Same as GFS1 fix in RHEL4 (bz 210362). We don't run throug the entire gfs_quota sparse file to do a list operation anymore. We get the layout of the gfs_quota file on disk and only read quota information off the data blocks that are actually in use. Also added functionality to GFS_IOCTL_SUPER to provide the metadata of the hidden quota file.
      we don't use this file anymore. removing
      Need to write the user/group id to the sysfs quota refresh file instead of '1'
      Changes to fix broken code after Bob pulled out metafs mounting functionality from gfs2_quota into libgfs2.
      Fix for bz248177: We delete the old /etc/mtab entry and add a new one during remount. Any changes made to the mount options using remount are reflected in /etc/mtab now.
      fix for bz253172 - gfs2 init script should not unload any kernel modules
      fix for bz253016: userland fixes for gfs2 quota linked list
      man page changes for new gfs2_quota reset option
      fix bz 311591 - make lock_dlm the default lock protocol in mkfs.gfs and mkfs.gfs2
      lon's patch removes 'Domain-0' check which was breaking xvm because cman starts before xend. patch also allows you to put NODENAME in /etc/sysconfig/cluster
      fix for bz333961 - adds support for -n and -f mount options
      gfs2_tool: remove 'gfs2_tool counters' as they aren't implemented anymore
      gfs-kernel: fix for bz 429343 gfs_glock_is_locked_by_me assertion
      gfs2_tool manpage: gfs2_tool counters doesn't exist anymore.
      gfs2_tool: Fix build warnings in misc.c bz 441636
      gfs2_tool manpage: Updates to the manpage for bz441636
      gfs2_tool: Fix build warnings in misc.c bz 441636
      gfs-kernel: Bug 450209: Create gfs1-specific lock modules + minor fixes to build with 2.6.27
      gfs-kernel: bug 450209 - addendum to previous patch. Removes extraneous lock_dlm_plock.c
      libgfs2: Bug 459630 - GFS2: changes needed to gfs2-utils due to gfs2meta fs changes in bz 457798
      gfs-kernel: bz298931 - GFS unlinked inode metadata leak
      Revert "gfs-kernel: bz298931 - GFS unlinked inode metadata leak"
      gfs-kernel: GFS: madvise system call causes assertion
      gfs-kernel: bz457473 - GFS ignore the noatime and nodiratime mount options
      Revert "gfs-kernel: bz457473 - GFS ignore the noatime and nodiratime mount options"
      gfs-kernel and mount.gfs2: GFS ignore the noatime and nodiratime mount options
      gfs-kernel: bz 458765 - In linux-2.6.26 / 2.03.06, GFS1 can't create more than 4kb file
      gfs-kernel: bz466677 - fault in posix_lock_file() - "gfs_controld" responds to orphaned "plock_xop" request - suspected cause is patch for Bug 196318
      gfs-kernel: Bug 466645 - reproduceable gfs (dlm) hanger with simple stresstest
      gfs2-utils: Bug 481762 - No longer able to mount GFS volume with noatime,noquota options
      gfs2_tool, libgfs2: bz487608 - GFS2: gfs2_tool unfreeze hangs
      gfs2_convert: Fix rgrp conversion to allow re-converts
      Revert "gfs2_convert: Fix rgrp conversion to allow re-converts"
      gfs2_convert: Fix rgrp conversion to allow re-converts
      gfs2_convert: Fix conversion of inodes with different heights on gfs1 and gfs2
      gfs2_quota: Bug 536902 - quota file size not a multiple of struct gfs2_quota
      libgfs2: Bug 459630 - GFS2: changes needed to gfs2-utils due to gfs2meta fs changes in bz 457798
      gfs2_convert: gfs2_convert should fix statfs file
      mount.gfs2: Better error reporting when mounting a gfs fs without enough journals
      gfs2_convert: gfs2_convert doesn't convert jdata files correctly
      gfs2_quota: fix uninitialized fiemap flags
      gfs2_quota: Fix gfs2_quota to handle boundary conditions
      libgfs2: fix build break caused by patch to bz 455300
      gfs2_convert: Does not convert full gfs1 filesystems
      gfs2_convert: gfs2_convert segfaults when converting filesystems of blocksize 512 bytes
      gfs2_convert: gfs2_convert uses too much memory for jdata conversion
      gfs2_convert: Fix conversion of gfs1 CDPNs
      gfs2_convert: Doesn't convert indirectly-pointed extended attributes correctly
      gfs2_quota: Keep quota file length always sizeof(struct gfs2_quota) aligned
      gfs2_convert: gfs2_convert doesn't convert quota files
      gfs2 manual pages: gfs2_convert manpage and documentation updates
      gfs2_convert: gfs2_convert doesn't resume after interrupted conversion
      gfs2_convert: corrupts file system when directory has di_height 3
Adam Manthei (52):
      Addresses an issue that was seen w/ the APC Masterswitch. logins can fail
      o added support for APC 79XX
      diff -u -r1.3 fence_apc.pl
      Fix for bug #129521
      fix a regex that ernoeously allowed the agent to access devices it shouldn't
      Fencing agents contributed by Lazar Obradovic laza@yu.net for
      I forgot to add Lazar Obradovic's fencing agents to the bin Makefile.
      From Lazar Obradovic:
      Add ribcl version 2.0 support to fence_rib.
      adding the Net::SSLeay version of the fence_rib agent. This should
      man page updates for the fence_rib and fence_ilo agents.
      move some functions around so that the script compiles
      fix typos that Erling found.
      Yet another method for fencing the IBM bladecenter. This method requires
      man page for fence_bladecenter
      bug 137035 -- add support for configurable ssl port locations to fence_ilo
      initial stab at init.d script
      remove /dev/misc/dlm-control if the major/minor number of the device node
      make the daemon more verbose in the script since the default logging is
      Jon changed the verbosity of ccsd. -v doesn't need to be on by default
      diff -u -p -r1.1 fence_brocade.pl
      fence_rib is deprecated by fence_ilo
      diff -u -p -r1.3 Makefile
      New man page now reads:
      mention fence_bladecenter in man page for fence_xcat
      o makes usage of ccs configurable
      Add Magma to the list of services that are ignore on shutdown.
      move the ordering of the scripts. lock_gulmd needs to be called before
      diff -u -p -r1.1 Makefile
      syncing w/ RHEL4 branch
      pid file is now in /var/run/cluster/ccsd.pid, not /var/run/sistina/ccsd.pid
      on shutdown, use gulm_tool to determine node liveliness rather than
      o cleaned up init.d script to remove excessive `cman_tool join` and
      having init.d/cman load the dlm module makes lon happy
      Apply Derek Anderson's patch to resolve bug 144514:
      Initial cut of drac fencing agent used at Oracle World.
      o cleaned up some man of the man pages
      don't start cman if <gulm> is defined in /etc/cluster/cluster.conf
      start gulm when --use_ccs is used only if <gulm> is defined in /etc/cluster/cluster.conf
      o fixed broken regex
      regex fix for gulm and cman check of /etc/cluster/cluster.conf
      partial fix to cman init script to address bug #147828.
      o added "-w" option to cman_tool join (bz #147828)
      remove the in_cluster() checking in stop. should no longer be needed
      uses -t option to cman_tool to time out wait operations
      Changes to the init scripts. BZ's 153739 and 153741.
      Support for Dell Remot Access Card III/XT
      o man page for fence_drac
      add fence_drac to default build
      o make init.d/fenced exit with WARNING instead of FAILED when using
      fix for bz 161352
      Add support for Dell PowerEdge 1855 to fence_drac (bz 150563). When using
Alasdair G. Kergon (6):
      Initial checkin.
      initial checkin
      test checkin
      test commit
      test checkin
      test checkin
Andrew Beekhof (2):
      dlm_controld: add pacemaker support
      dlm_controld: pacemaker build
Andrew Price (27):
      [[BUILD] Warn and continue if CONFIG_KERNELVERSION is not found
      [GFS2] gfs2_fsck: Fix operation on 'ptr' may be undefined warnings
      [GFS2] Remove unrequired header file
      [GFS2] gfs2_edit: Remove duplicate linux_endian.h
      [GFS2] libgfs2: Build with -fPIC
      rgmanger: remove check on cluster.conf from rgmanager init script
      fence: simplify init script
      fence: port scsi agent to use ccs_tool query and drop XML::LibXML requirement
      askant: Import askant into tree
      askant: Fix linking with libgfs2
      askant: prog_name is no longer required
      gfs2-utils: Remove 'die' calls from check_for_gfs2
      gfs2-utils: Remove 'die' calls from set_sysfs
      libgfs2: Move get_list out of libgfs2
      gfs2-utils: Remove 'die' calls from mount_gfs2_meta
      libgfs2: Remove 'die' calls from __get_sysfs
      libgfs2: Clean up mp2fsname2
      libgfs2: Remove 'die' from mp2fsname and find_debugfs_mount
      libgfs2: Remove die from compute_heightsize
      libgfs2: Remove die from fix_device_geometry
      gfs2-utils: Clean up leftover prog_name globals
      fsck.gfs2: Remove compute_height
      gfs2-utils: Remove askant, contrib
      libgfs2: Fix 'dubious one-bit signed bitfield' sparse errors
      gfs2_quota: Fix sparse error
      libgfs2: Fix "Value stored is never read" warnings
      fsck.gfs2: Make block_mounters static
Benjamin Marzinski (78):
      The GNBD clients and servers now distinguish eachother via nodename, instead
      fixed obvious error.. If the gnbd_ctl device wasn't present, I was trying
      devfs was set to use /dev/gnbd/<minor>, but gnbd_import sets the gnbds cluster
      missing "_" in gnbd_monitor... caused gnbd_import to fail with uncached devices
      updated gnbd kernel patch to include recent changes
      initialize the polls array. This will make sure that gnbd_monitor realizes
      Changed the /dev/gnbd_ctl node I make from major 11 to major 10, which is what
      Fixed some annoying bugs. Previously, when you unimported a gnbd (with
      added -fPIC to the makefiles, so the cman.so, gulm.so, etc.. would compile
      Fixed some login structure size issues, so that gnbd clients and servers can
      Removed the dependency on "all" from the "uninstall" target
      link to the pthread library because magma needs it now.
      Bunch of gnbd fixes. The big one is that gnbd_monitor works like it's supposed
      Oops. Forgot to update the man pages.
      updated Makefile so that it make distclean works
      Forgot to update kernel stuff.
      there was a problem with running two gnbd_clusterd processes at the same
      Updated patches to 2.6.8.1
      even though you now fence nodes by their cluster name, not their IP address,
      fixed printing issue for 6.1
      Fixed gnbd_import and gnbd_export so that -r with no devices returns an
      Updated gnbd-kernel patches for 2.6.9
      removed unneeded variable
      added the stack overflow fix to head (rbz139863) gfs_glock_nq_m(), nq_m_sync(),
      oops. switched long variable to unsigned long so that the gnbd device size
      Fixed gnbd_export remove error message for bug 143131. Now it will
      fixed journal corruption mentioned in bug 146672. If the latest entry written
      Fixed 146672. While it is still possible to see this bug, the problem that I
      Syncing head with RHEL4 branch.
      added man page for gnbd_serv
      Modified gnbd so that it can be used with multipath easier. When you export
      fixed bug that caused gnbd_monitor to only successfully monitor a device until
      Fix for bz #155597. GFS used to be able to write over a portion of the log
      This fixes bz # 156635. A variable that I allocated statically in my fix for
      Heres my fix of a fix of a fix. I wasn't initializing a list that needed to
      Adding Fabio Massimo Di Nitto's patch to keep up with changes in the kernel
      When you copy an suid root file to gfs, you start a transaction on the
      fixed acl code so that acls are displayed when enabled, and not displayed when
      switched 'deamon' to 'daemon' in uninstall rule.
      add watchdog.o to the rgmanager object files, to fix undefined reference to
      gfs2 was unable to truncate files if they were opened with write permissions,
      Update gfs2_tool to make use of sysfs instead of ioctls where appropriate.
      Fix for bz 142849
      This fixes a problem from bz #173697. gfs_fsck crashed on many types of
      initial commit of csnap kernel code with a useful build structure. The
      Reorganize this directory to match the rest of the cluster tree.
      This is the gnbd code that is required for getting multipathing to work on
      fixed gnbd so that it compiles with the upstream kernel.
      Fixing the get_uid code to make it easier to integrate with multipath
      If it took too long for an uncached gnbd recvd process to stop after the server
      Make GNBD work with cman.
      updated man pages with UID information
      more Makefile fixing
      changed from MODULE_PARM to module_param
      There. that should work better
      libsysfs is deprecated. Stop using it.
      fixing dm-multipath support for GNBD
      pull devfs stuff out of gnbd.
      setting multiple locations for gnbd_get_uid to check for scsi_id, and updating
      file log.c was initially added on branch RHEL4.
      file log.h was initially added on branch RHEL4.
      Fix for bugzilla #207599. The individual gserv processes inherit the atexit
      Change the way gnbd notifies multipathd about device changes, to deal with the
      Make gnbd work with cman correctly. This sort of roughly falls under the heading
      Really gross hack!!!
      This is a bugfix for bz #211923.
      fix for bz215095 & 215099.
      make it so that the -c and -[u|U] flags are mutually exclusive.
      GNBD was hanging with the cfq scheduler, so I changed the default scheduler for
      Get GNBD compiling with the latest upstream kernel.
      GNBD doesn't need to flush the cache after it looses connection with the server.
      Fix for bz #426291. gfs_glock_dq was traversing the gl_holders list without
      The gnbd kernel module on 64 bit architectures didn't handle ioctls from 32 bit
      gnbd-kernel: Fix receiver race
      [gnbd-kernel] bz 449812: disallow sending requests after a send has failed.
      [gnbd-kernel] bz 442606: Switch gnbd to use deadline scheduler by default.
      gfs-kernel: workaround for potential deadlock. Prefault user pages
      libgfs2: mount device for metafs
Bob Peterson (202):
      Resolves: bz 435917: GFS2: mkfs.gfs2 default lock protocol
      Resolves: bz 421761: 'gfs_tool lockdump' wrongly says 'unknown
      Resolves: bz 431945: GFS: gfs-kernel should use device major:minor
      Merge branch 'master' of ssh://sources.redhat.com/git/cluster into master.bz431945
      Update to prior commit for bz431945: I forgot that STABLE2
      Resolves: bz 436383: GFS filesystem size inconsistent
      Fix savemeta so it saves gfs-1 rg information properly
      Fix gfs2_edit print options (-p) to work properly for gfs-1
      gfs2_edit was not recalculating the max block size after it figured
      Fix some compiler warnings in gfs2_edit
      bz440896/440897 GFS: gfs_fsck should repair gfs_grow corruption
      bz425421: gfs mount attempt hangs if no more journals available
      bz438762: gfs_tool: Cannot allocate memory
      bz295301: Need man page for gfs_edit
      Replace put_inode with drop_inode
      bz 446085: Back-port faster bitfit algorithm from gfs2 for better
      Fix gfs2_edit bugs with non-4K block sizes
      Make gfs2_edit more friendly to automated testing.
      Updates to gfs2_edit man page for new option.
      Allow keywords in block number input
      Ability to specify starting block or structure with -s
      Fix compiler warning.
      Added an optional block-size to mkfs.gfs2
      Fix build warnings in gfs2-utils.
      Fix another compiler warning for 32-bit arch.
      Fix build warnings from libgfs
      Fix gfs_debug build warning
      Ignoring gets return value in gfs_mkfs
      Fix gfs_tool build warnings
      Fix gfs_fsck build warnings
      Fix 32-bit warning in super.c.
      452004: gfs: BUG: unable to handle kernel paging request.
      savemeta was not saving gfs1 journals properly.
      gfs2_fsck fails: Unable to read in jindex inode.
      Print log header flags for gfs journals.
      Speed up userspace bitmap manipulation code.
      gfs_fsck crosswrite for block number sanity checking
      Fix some bad references to gfs_tool and gfs_fsck
      Deleted unused function print_map
      Shrink memory 1: eliminate b_size from pseudo-buffer-heads
      Shrink memory 2: get rid of 3 huge in-core bitmaps
      Shrink memory 3: smaller link counts in inode_info
      Better error reporting in gfs2_fsck
      RGRepair: Account for RG blocks inside journals
      gfs2_fsck dupl. blocks between EA and data
      gfs2_edit: Ability to enter "journalX" in block number.
      gfs2_edit: was parsing out gfs1 log descriptors improperly
      gfs2_edit: Improved gfs journal dumps
      mkfs.gfs2: should have an optional fs size parm
      GFS2: Make gfs2_fsck accept UNLINKED metadata blocks
      Changes needed to stay compatible with libvolume_id.
      Changes needed to stay current with libvolume_id.
      GFS2: sync buffers to disk when rewriting superblock
      GFS2: gfs2_fsck: fix segfault while running special block lists.
      GFS: gfs_fsck invalid response to question changes the question
      gfs-kmod: GFS corruption after forced withdraw
      GFS2: gfs2_edit savemeta doesn't work with GFS
      Fix many bugs with gfs2_convert.
      Use jbsize for height computations on journaled files.
      mkfs.gfs2 hangs with many journals
      Grab hold of journal-turned-RG buffers so they're not freed.
      Remove splice_read file op for jdata files.
      Make gfs2_freedi delete indirect blocks with height >= 2
      gfs: improve gfs_fsck rindex repair code
      Non-default block size confuses gfs2_grow
      Remove bogus TODO file from gfs2_fsck.
      GFS2: Add human readable output to gfs2_tool df
      Replace deprecated do_seek, do_read, do_write functions
      Get rid of build warning in gfs2_convert.
      Add Filesystem UUID to GFS2 utils.
      GFS2: make gfs2_fsck conform to fsck(8) exit codes
      GFS2: gfs2_edit fixes for 5.4
      gfs2_tool df segfault on non-4K block size
      gfs2_grow man page references removed -r option
      gfs2_convert results in GFS2 File System Corruption
      gfs2_edit: Display pointer numbers and use color changes to
      GFS2: gfs2_edit savemeta not saving per_node quota files
      Fix gfs2_fsck segfault
      GFS2: gfs2_fsck should fix journal sequence number problems
      mount failure after gfs2_edit restoremeta of GFS file system
      gfs2_edit savemeta needs to save freemeta blocks
      gfs2_edit: Fix indirect block scrolling
      Correction to an earlier commit. Buffers were being updated
      Removed check for incorrect height
      GFS2: gfs2_edit savemeta wasn't saving indirect eattribute blocks
      GFS2: gfs2_edit savemeta wasn't saving ea sub-blocks
      GFS2: fsck.gfs2 sometimes needs to be run twice
      fsck.gfs2 writing bitmap when -n specified
      Fixed compiler warnings and errors that crept in.
      GFS2: gfs2_convert, parameter not understood on ppc
      /sbin/mount.gfs2: can't find /proc/mounts entry for directory /
      Message printed to stderr instead of stdout
      fsck.gfs2 segfaults while fixing 'EA leaf block type' problem.
      gfs2_edit produces unaligned access
      Fix more man page references to gfs2_fsck
      "fsck.gfs2: invalid option -- a" on boot when mounting gfs2 root
      GFS2: gfs2_fsck segfault in rindex repair code
      GFS2: fsck.gfs2 sometimes needs to be run twice
      The mount -o remount option failed due to new relatime option
      Allow gfs2_edit printsavedmeta to print destination size and type
      Make gfs2_edit -p <block> blockalloc work for gfs1 file systems
      Allow gfs2_edit to display and print gfs1 rgrps
      gfs2_edit: Indirect pointers missing from list when paging up and down
      gfs2_edit: Add missing superblock fields for gfs1 file systems
      gfs2_edit: Fix rindex read function for gfs1 file systems
      GFS2: gfs2_edit prints wrong directory entry type for gfs1
      gfs2_edit -p block# shows wrong height/offset on gfs1 and segfaults on gfs2
      GFS2 filesystem inconsistent after xfstests test suite run
      fsck.gfs2 unable to fix some rindex corruption for block size < 4K
      GFS2: fsck.gfs2 should fix the system statfs file
      GFS2: gfs2_edit savemeta bugs
      Remove nvbuf_list and use fewer buffers
      Eliminate bad_block linked block list
      Simplify bitmap/block list structures
      Streamline the bitmap code by always using 4-bit size per block
      Misc blocklist optimizations
      Separate eattr_block list from the rest for efficiency
      gfs2: remove update_flags everywhere
      fsck.gfs2: give comfort when processing lots of data blocks
      fsck.gfs2: make query() count errors_found, errors_fixed
      Attach buffers to rgrp_list structs
      Make struct_out functions operate on bh's
      Attach bh's to inodes
      gfs2: Remove buf_lists
      fsck.gfs2: Verify rgrps free space against bitmap
      libgfs2: Consistent naming for blockmap functions
      Move duplicate code from libgfs2 to fsck.gfs2
      libgfs2, fsck.gfs2: simplify block_query code
      gfs2: libgfs2 and fsck.gfs2 cleanups
      libgfs2: fs_bits speed up bitmap operations
      libgfs2: gfs2_log reform
      fsck.gfs2: convert dup_list to a rbtree
      fsck.gfs2: convert dir_info list to rbtree
      fsck.gfs2: convert inode hash to rbtree
      fsck.gfs2: pass1 should use gfs2_special_add not _set
      libgfs2: Remove unneeded sdp parameter in gfs2_block_set
      libgfs2: dir_split_leaf needs to zero out the new leaf
      libgfs2: dir_split_leaf needs to check for allocation failure
      libgfs2: Set block range based on rgrps, not device size
      fsck.gfs2: should use the libgfs2 is_system_directory
      fsck.gfs2: Journal replay should report what it's doing
      fsck.gfs2: fix directories that have odd number of pointers.
      libgfs2: Get rid of useless constants
      fsck.gfs2: link.c should log why it's making a change for debugging
      fsck.gfs2: Enforce consistent behavior in directory processing
      fsck.gfs2: enforce consistency between bitmap and blockmap
      fsck.gfs2: metawalk needs to check for no valid leaf blocks
      fsck.gfs2: metawalk was not checking many directories
      fsck.gfs2: separate check_data function in check_metatree
      lost+found link count and connections were not properly managed
      fsck.gfs2: reprocess lost+found and other inode metadata when blocks are added
      Misc cleanups
      fsck.gfs2: Check for massive amounts of pointer corruption
      fsck.gfs2: use gfs2_meta_inval vs. gfs2_inval_inode
      Eliminate unnecessary block_list from gfs2_edit
      fsck.gfs2: rename gfs2_meta_other to gfs2_meta_rgrp.
      Create a standard metadata delete interface
      fsck.gfs2: cleanup: refactor pass3
      fsck.gfs2: Make pass1 undo its work for unrecoverable inodes
      fsck.gfs2: Overhaul duplicate reference processing
      fsck.gfs2: invalidate invalid mode inodes
      fsck.gfs2: Force intermediate lost+found inode updates
      fsck.gfs2: Free metadata list memory we don't need
      fsck.gfs2: Don't add extended attrib blocks to list twice
      fsck.gfs2: small parameter passing optimization
      fsck.gfs2: Free, don't invalidate, dinodes with bad depth
      Misc cleanups
      fsck.gfs2: If journal replay fails, give option to reinitialize journal
      Fix white space errors
      fsck.gfs2 fails on root fs: Device X is busy.
      gfs2_edit savemeta: Don't release indirect buffers too soon
      fsck.gfs2: Use fsck.ext3's method of dealing with root mounts
      GFS2: libgfs2: build_rgrps was not attaching bh's properly
      gfs2: fix regressions from performance fixes
      gfs2: GFS2 utilities should make use of exported device topology
      cman: gfs_controld dm suspend hangs withdrawn GFS file system
      GFS2: fsck.gfs2 segfault - osi_tree "each_safe" patch
      gfs2_fsck segfault when statfs system file is missing
      gfs2_edit restoremeta should not return 0 on failure
      fsck.gfs2: unaligned access on ia64
      GFS2: libgfs2 bitfit algorithm using wrong shift point
      Make gfs2_edit show bit-to-block translation when viewing bitmaps
      gfs2-utils: mkfs can't fsync device with 32MB RGs
      fsck.gfs2 deletes directories if they get too big
      fsck.gfs2 segfaults if journals are missing
      Updating /proc/mounts and /etc/mtab with mount args for GFS2 fs
      Reported UUID from 'gfs2_edit -p sb' should be lower-case
      fsck.gfs2 truncates directories with more than 100,000 entries
      GFS2: fsck.gfs2 seems to process large files twice
      gfs2_edit: better printing of directory leaf information
      gfs2_edit: print hex numbers in lower-case
      gfs2_edit: negative block numbers don't jump a negative amount
      gfs2_edit: tiny (stuffed) files had user data saved with savemeta
      gfs2_edit: give meaningful feedback for savemeta and restoremeta
      gfs2_edit: Fix memory leak in savemeta option
      gfs2_edit: Split extended display functions into extended.c
      gfs2_edit: Move more functions to extended.c
      gfs2_edit: Extend individual field printing/editing
      gfs2_edit: fix page down on rindex
      gfs2_edit: print field names in right column
      gfs2_edit: display block allocation on rgrps and bitmaps
      Fix extended.c and extended.h for autotools
Chris Feist (107):
      Removed old includes from the kernel.
      Added an include for ${incdir} in case inc dir
      Modified the 'make all' command to install the files into the cluster/build
      Misspelled the module_dir in the previous checkin. (It was moduledir)
      Removed a debug line from the Makefile, which is not necessary anymore
      Added cman-kernel patches for the new 2.6.9 kernel.
      Added a target to edit the release numbers when preparing tarballs.
      Updated version number
      Changed the build process for rgmanager so it more closely resembles the
      Added make directory for rgmanager
      Fixed rgmanger build process
      Updated configure scripts to include a share directory (required by rg_mamanger)
      Added makefile
      Removing accidetally include defines.mk
      Fixed configure script so it would configure rgmanager.
      Added a script for building srpms from the individual components.
      RPM spec file for magma.
      Fixes to the scripts for building SRPMS.
      Added rpm spec file for gnbd
      Fixes in the script to build srpms.
      Added rpm spec file.
      Fixed variable in rpm spec file.
      Changed srpm building location.
      Added spec file for fence
      Added spec file for dlm-kernel
      Fixed dlm.spec.in
      Added spec file.
      Fixed directory locations
      Moved GFS-kernel.spec.in to gfs-kernel.spec.in
      Moved specfiles to proper name.
      Updated location of srpms.
      No error if srpms already exists.
      Created a .PHONY tag for srpms.
      Fixed name of tarball to install.
      Uninstall script didn't uninstall gnbd.h properly.
      Fixed problems w/ make uninstall not working w/ rgmanager.
      Added fence_bladecenter.8 & fenced.8 to the install script for fence manpages.
      Added option to install init scripts.
      Added install/uninstall to main Makefile for init script.
      Updated toplevel Makefile to install init script.
      Fixed a problem which wouldn't uninstall the initscript.
      Added install of init script in top level Makefile.
      Updated copyright code
      Added slibdir into the toplevel make so we don't touch /usr/lib
      Merged changes from RHEL4 branch to fix building if not installing.
      Added forgetten line in configure script which caused --sharedir
      Added LDFLAGS variable in Makefile so dlm_tool will build if libdlm_lt is not installed.
      Changes to the way the local build works:
      file README was initially added on branch RHEL4.
      file TODO was initially added on branch RHEL4.
      file configure was initially added on branch RHEL4.
      file defines.mk.input was initially added on branch RHEL4.
      file release.mk.input was initially added on branch RHEL4.
      file uninstall.pl was initially added on branch RHEL4.
      file Makefile was initially added on branch RHEL4.
      file dm-cmirror-client.c was initially added on branch RHEL4.
      file dm-cmirror-cman.c was initially added on branch RHEL4.
      file dm-cmirror-cman.h was initially added on branch RHEL4.
      file dm-cmirror-common.h was initially added on branch RHEL4.
      file dm-cmirror-server.c was initially added on branch RHEL4.
      file dm-cmirror-server.h was initially added on branch RHEL4.
      file dm-cmirror-xfr.c was initially added on branch RHEL4.
      file dm-cmirror-xfr.h was initially added on branch RHEL4.
      file dm-log.h was initially added on branch RHEL4.
      Removed directories not needed for RHEL5.
      Removed gulm from HEAD.
      Removed gulm directory as it is no longer used.
      Removed dlm-kernel & gulm from configure script.
      Removed gulm from Makefile.
      Removed most references to magma from gnbd.
      Fixed fence so includes the proper directory for cman.
      Added ccsincdir and ccslibdir to facilitate building.
      Updated include to properly find gfs_ondisk.h.
      Temporarily disable quota build until functioning w/ gfs2.
      Modified makefile to use instead of hardcoded directory.
      - We don't use the copytobin target in libgfs2
      Fixed install script to install appropriate binaries.
      Initscripts should be installed in /etc/rc.d/init.d
      Remove references to magmalibdir & magmaincdir.
      Assign ccslibdir to the appropriate variable in the configure script.
      Use ccslibdir instead of libdir to find ccs libraries.
      Install files in the correct location.
      Not necessary to specify /sbin.
      Fix configure script so we don't try to pass ccsincdir & ccslib dir to
      Makefile fixes to assist with rpm building.
      Fixed building for x86_64.
      Added -lpthread to LDFLAGS to fix bz #198187 (Unresolved symbols w/ ldd -r)
      Reverted changes to fix 64 bit arch building.
      Copied gfs_ondisk.h from gfs-kernel to allow builds to succeed.
      - Install the init script in the correct place.
      Added gfs_ondisk.h to allow builds outside of tree.
      Don't force the owner to root (breaks rpm build).
      The gfs package should be installing the umount/mount.gfs links.
      Create symlinks for mount.gfs & umount.gfs.
      - Fix for bz #206325, ccs should not be started with the '-X' option & return
      - Added in fixes to make gfs-kmod compatible with the RHEL5 kernel
      Update building for xvm fence agent to build cleanly in brew.
      Added changes to support installing init scripts w/ brew build.
      We don't want to delete the scsi_reserve init script when doing a make clean.
      Need to include directory for ccs.h header file.
      Fixes to prevent compile time warnings/errors in brew.
      Added date. (test git commit)
      Test git commit.
      Removed newline. Test git commit.
      Added back in change to description line to make chkconfig work properly.
      fence: fixed a fence storm with fence_egenera
      cman: fixed makefiles to actually install the vmware manpage
Christine Caulfield (98):
      cman3 commit
      [CMAN] Don't ignore cman_tool version
      [CMAN] Remove deleted nodes from our list
      Merge branch 'cman3'
      Fix multicast display in 'cman_tool status'
      Initialise votes to 0
      [DLM] Don't segfault if lvbptr is NULL
      [CCS] Fix the config loader for good
      [CMAN] Make cman cope with the new objdb structure
      [CMAN] Free up any queued messages when someone disconnects
      [CMAN] Limit outstanding replies
      [CMAN] Don't declare a variable in the middle of a block
      [CMAN] valid port number & don't use it before validation
      [DLM] Mention lidlm_lt in the man page
      Remove references to broadcast.
      [CMAN] Save the new expected_votes when a node is removed
      [MISC] Make it build with gcc 4.3
      [FENCE] Make it build with gcc 4.3
      [CMAN] Disallow a new dirty node from joining the cman cluster
      [CMAN] Remove external dependancies from config modules
      [CMAN] Fix localhost checking that I broke last week.
      [CMAN] make qdisk compile on i386
      [CMAN] fix cman_tool join -X
      [CMAN] Don't busy-loop if we can't get a node name
      [CMAN] Fix some compiler warnings on 64 bit systems
      [CMAN] use list_iterate_safe when removing nodes
      [CONFIG] Add ldap configurator
      [CONFIG] Make ldap put totem in the right place
      [CONFIG] Improve LDAP error reporting
      [CMAN] Add a config update callback
      [CMAN] Only do timestamp check for older nodes.
      [CMAN] Fix logging options
      [CMAN] Remove some redundant code.
      [CONFIG] Add some more ldap comments
      [CONFIG] Add ldap loader
      [CONFIG] rename ldap config generator
      [CONFIG] Add a man page for confdb2ldif
      [CMAN] Remove some spurious prints
      [CCS] Set errno when an error occurs.
      [CMAN] Don't use logsys in config modules.
      Revert "[CMAN] Don't use logsys in config modules."
      [CMAN] Don't use logsys in config modules.
      [CCS] Fold ccs_test into ccs_tool and tidy
      [CCS] add -c flag to ccs_tool query
      [CONFIG] Add some more errnos to libccsconfdb
      [CCS] Set return status on failure
      [CCS] Make ccs_tool/ccs_test more consistent
      [CMAN] Fix overridden node names
      [CMAN] pass COROSYNC_ env variables to the daemon
      [CMAN] Display the node's votes in cman_tool status
      qdisk: fix compile error when building without debug.
      cman: Revert dirty patch
      cman: exit if configuration check fails.
      cman: tidy objdb_get_int
      cman (mainly): use corosync
      cman: Fix find_handle leak
      cman: fix objdb-destroying typo
      cman" load openais services by default
      cman: Silence some compiler warnings.
      cman: add cman_tool -A to disable load of openais services
      cman: Return quorum state in a STATECHANGE notification
      cman: return the correct length of a message
      cman: Allow a recently left node to join cleanly.
      cman & config: Move special cases out of config modules
      cman: cope better with malformed config files
      config: fix ldap load bug caused by new objdb ordering in corosync
      config: Remove stray fprintf
      cman: Initialise variable
      cman: honour the dirty flag on a node we haven't seen before
      config: Allow multiple top-level keys
      cman: Copy "service" keys down to corosync
      config: Get rid of files I committed accidentally.
      cman: rename 'move' functions to 'copy'
      cman: Clean shutdown_con if the controlling process is killed.
      cman: fix a couple of unhandled malloc failures
      cman: fix two_node startup if -e is specified
      cman: Some fixes for configless running
      cman: Add some more comments about shutdown
      cman: Some edits to the cmannotifyd man page
      cman: Tidy some english phrases and typos
      cman: replace high_nodeid with votes in transition message
      cman: fix signatures of cman_get_privdata & cman_set_privdata
      cman: Don't crash cman_tool nodes -a
      cman: make it compile with latest corosync
      cman: Make cman the quorum provider for corosync
      cman: Make cman-preconfig reload <totem> too
      cman: fix memory leak
      cman: add cman3 services
      cman: drastically improve startup errors
      cman: fix cman_tool join return code
      cman: make 'cman_tool leave -w' wait until cman has shut down
      cman: Return an error if 'cman_tool leave' is attempted during shutdown
      cman: let 'cman-tool leave -w' wait even if shutdown has already started
      cman: Make new services compile with latest corosync
      cman: more corosync changes
      cman: fix error checking in testcmanquorum1
      cman: Make API calls work on an inquorate system
      gfs2: Fix includes for building on rawhide
Daniel Phillips (35):
      Initial add, csnap code and docs
      Added devspam.c device data verification utility
      Actually commit (cvs, why are you so broken?)
      Socket-over-socket control interface now functional
      Create virtual device by execing dmsetup
      Now works with unmodified device mapper, hack removed
      Target now asks for connection when it needs one
      Change to named socket for device control connection
      Get rid of startup race by having csnap server fork its own daemon after
      Renamed service.c to agent.c
      Move development to 2.6.8.1
      Remove out of date 2.4.26 kernel patch
      Updated design document with details on server failover
      Add "manual" csnap server failover
      Teach diff about dm-csnap.c and dm-csnap.h
      Sigh. Get the gfs bits out of the patch
      Add agent.c and sendagent.c
      Agents now instantiate servers using a dlm-based protocol
      Fix csnap_create so that csnap_destroy recovers dm devices on error
      Added kernel snapshot read lock upload for failover
      - Csnap server now has journalling and recovery, it seems to work
      Turn off tracing output, remove extraneous debug output
      - bug fix: remove premature optimization that breaks when blocksize < chunksize
      Turn off tracing, remove server crash simulation
      Fix struct client pointer stability bug in server too
      - Fixed struct client pointer stability bug in server too
      Whoops.
      BTree one-pass delete is now incremental with leaf/node coalescing
      BTree snapshot delete refactored and cleaned up, tree collapsing added, a
      Add index buffer dirtying, block freeing to snapshot delete. More cleanups.
      Add snapshot store expansion
      Add libpopt command line args parsing to mksnapstore, csnap-server
      Poptize csnap-server, mksnapstore the rest of the way
      Initial add, ddraid
      add interface to specify initial socket as ascii fdnumber to avoid exporting sys_socket and sys_connect
David Teigland (1410): ChangeSet@1.1682, 2004-06-24 13:34:54+08:00, teigland@redhat.com ChangeSet@1.1684, 2004-06-24 23:22:05+08:00, teigland@redhat.com ChangeSet@1.1683.1.1, 2004-06-25 11:19:18+01:00, patrick@jeltz.pjc.net change build/install order some tidying and improve error messages put lockfile in /var/lock/ by default use copy_from_user to get the register name allocate name space for max length, not actual length include lockspace.h Fix the way we do lkb deletions. The DELAST flag is removed and a update from src files a bunch of assertions to catch errors early on another warning and assertion include directory sequence numbers in prints better log message fix log message minor changes to resource directory lookups. - Patrick's fix to set rsb nodeid to -1 during unlock crept into a remove and clean up some debugging - when a dlm_unlock() removes the last lkb from an rsb, reset the - when an assertion fails dump all rsb's and lkb's to console update from src files loosen assertion to allow us to send an rsb lookup to ourselves pjc found spot where lock struct wasn't being freed - in unlock get the rsb after holding the in_recovery lock otherwise an in dlm_lock_stage1 set lkb nodeid to the rsb's nodeid only after the - set lkb nodeid to -1 when it's created. this prevents an assert during a recent checkin incorrectly switched unlock_stage2 to reference an Naming changes. Get rid of struct typedefs and gd_ prefix. Improve some remove stray printk mess with tabs (unaligned in cvsweb...) Change the way lm_dlm_unlock() works. We previously demoted to NL on update from src files - change in the way name_to_directory_nodeid works. 
get rid of the bitmask - /proc/cluster/dlm_dir to dump resource directory remove "resdata" from function names since we've renamed things get rid of the _recovery version of dir_lookup which only related to the recently changed version of release_rsb() checks the new Three different problems resulting from recent change to quit doing a dlm_unlock_stage2: copy new lvb from lkb before granting new locks to - the new MASTER rsb flag wasn't being set during recovery when update from src files print rsb in remote_stage2 assertion recycle dir entries when rebuilding res directory instead of freeing and pass the dlm_header into add_to_requestqueue so we can print the cmd if deserialise_lkb finds the lkb already exists, advance the pointer so tidy a bit of code that decides if a new master should be looked up. In-progress unlock requests interrupted by a recovery event weren't if remote_stage2() gets a request, it should return EINVAL if: update from src files initialize cluster_is_quorate to 0 add BUG() to assertion An unlock request is waiting for a reply from a master node. That on assert failure dump dlm's debug buffer - Improve/fix the way we lock the rootres list. The two rwsem's we update from src files fenced was not correctly matching fqdn's from cman and basic hostnames remove a printk that can be annoying When a request completion is received in lock_dlm during recovery, the - clvmd_fix_conf.sh usage was wrong die if we cannot talk with ccs when starting up Do reference counting on lockspace structs. Looping mount/umount tidy DLM_ASSERT statements Add locking around ls_requestqueue as it's accessed by both print the actual error value ccs returns retry calls to ccs_connect if they return an error Send a DROPLOCKS callback to gfs once every minute while the number of The lock counter used for DROPLOCKS callback wasn't accurate. 
It now update from src files - when processing a combined delete and completion ast, do the deletion try to clear up the build steps A bunch of fixes related to dir lookup's, EEXIST errors from lookups, remove another assert When granting locks, ignore locks on the wait queue that are also on plain make in cluster dir doesn't work, but make install should Make sure plock lp is off all lists before deleting, doesn't appear to basts don't apply to the "plock update" lock, so don't provide a change flag name to reflect expanded use better format for lock dump on assert - rebuild_freemem() was being called too early more refining of whether we deliver or skip bast delivery "cman_tool version -r <config>" can be used to update the copy the current config_version into the cl_version struct in when a recovery interrupts first start, the last_start value needs previous fix incomplete, one other copy to zero man page for cluster.conf move cman_tool.8 into man dir, add Makefile to install install man dir moved to man dir remove assertion that's not correct when multiple lock requests are use the force=2 option for dlm_release_lockspace() to disregard any update from src files update from src files some spacing cleanup Use a separate DELAY_RECOVERY flag to postpone join/leave processing if the first start is cancelled by a stop before finishing, reset the a couple error messages Use a recovery daemon per lockspace. This avoids deadlock when nodes use kthread routines to start/stop ast thread patches for 2.6.8.1 change the order in which we set up join/leave values so another Change the lock granting logic to mirror the behavior of VMS. The Changes to how dlm flags are used based on dlm changes. update from src files remove a couple log_debug's use kthread routines gfs needs a positive error from plock_get if there's a conflict use kthread routines - if a first start was aborted after getting past node setup, the Put a lock around start and exit of dlm_recoverd thread. 
add a couple syslog messages when nodes are fenced get rid of "dir entry exists" messages change the way we check for local plock conflicts An agent's error message should now be recorded in syslog (worked in my use down_interruptible, fcntl may return EINTR always take the ast arg from dlm_unlock(), even when NULL we must provide the correct astarg to dlm_unlock now that NULL is valid lock_dlm needs to return EAGAIN itself update from src files sync up flags with dlm.h Avoid doing a synchronous unlock in unhold_lvb by just letting the another shot at correctly stopping dlm_recoverd thread when fix dependency use simpler kthread routines for serviced update per build improvements by cfeist and anticipating a 2.6.9 kernel the journalid code inherited the hold_lvb changes for gfs; it's not A slight variation on one of the recovery special cases wasn't a "minimum gfs" howto that I wrote up a long time ago: We need to call process_requestqueue() when finishing a first recovery udp, not tcp mention how to use a different port for cman Update to the new plock lm interface and 2.6.9. This fixes the problem Change the default drop-locks value to 50,000. Both the drop locks update from src files update A little optimization I've intended to do forever: get rid of the Cached null locks that had been used with plocks were being freed allow drop_locks_count of zero to disable drop callbacks altogether allow max nodes to be set from /proc/cluster/lock_dlm/max_nodes remove rcom debugging code that we're not using Add a semaphore to serialize recovery with mount-group portion of missed a couple spots when removing rcom stats remove 2.6.8.1 patches update from src files Allocate an lvb for a new master rsb during recovery if any VALBLK Set the VALBLK flag on the NL locks used for gfs's hold_lvb. 
Not mixing these changes should fix the rare problem of waking a recoverd thread add an extra line to syslog to help track what's happening freeing a value before using in debug print add a log_debug line fix can_avert_fence() function, was using cman api incorrectly Bug in names_equal() caused nodes to never be identified as "first We should wait for the fence domain join to complete before allowing Add more intelligence to fence daemon to avoid fencing nodes Three config options for fence deamon can be set in cluster.conf or Allow delays of -1 to indicate forever. The lvb wasn't being copied into the master-copy lkb on unlocks, The VALNOTVALID flag was being incorrectly cleared during recovery in remove warning and a couple unnecessary type casts While waiting for a manual ack, also check to see if the node has Update per recent changes and new preferred option: -n <nodename> add content Fix some problems in the recently added lvb and valnotvalid flag munge usage wording don't include old license dir more helpful error message - add code to support unfence option Add -u option to unfence the node. 
Does nothing if unfencing missed clearing VALNOTVALID flag in unlock case ccs request began //nodes/ instead of //cluster/nodes/ begin ccs requests with /cluster instead of //cluster another tag name change new cluster.conf tag names zero the lvb when it's invalid since the dlm will not necessarily do Use sequence numbers to restore the most recent lvb copy during recovery DEBUG2 should be off by default lvb sequence number should be incremented by dlm_unlock only do lvb recovery when a node is removed from the lockspace ignore the S option that fence_tool uses and may be inherited A bunch of improvements: add fence_tool man page Add an msleep(500) to dlm_recoverd in an attempt to avoid the add fence_tool - update cluster startup tips, including new fenced delay/options clean up join_ccs code Re-order some initialization so that ccs errors will be caught and don't leave if gfs is mounted minor update slight correction on how leave remove works Two new flags that can be used for gfs's ANY flag: ALTPR and ALTCW. use dlm alt modes for LM_FLAG_ANY update from src files update from src files prefix output with prog name add debugging output for -D and a check to see if ccs connect works add -D debugging option A node gets 1 vote by default when no votes value is specified. 
- slight correction in description of expected_votes document the optional nodeid setting add sections on setting panic_on_oops and sources of more info When rebuilding an lkb on a new master, the lockqueue_flags value Release rsb's semaphore before queue_ast() since queue_ast() can Log the name of the node fencing is deferred to or say "prior member" Use ccs_get_list instead of ccs_get to prevent infinite looping when change -s to -n in syslog message use sigsuspend instead of pause as the reliable way to wait for a signal yesterday's signal changes broke "fence_tool leave" which requires we use die() macro so prog_name is prefixed to error messages in startup check add a loop around the cman GETNODE ioctl since it fix freeing name string before printing it in syslog - Leave the lockspace when gfs does a withdraw. For now we abandon - don't ignore the completion callback for a canceled lock log more info to syslog to help people see why nodes are fenced remove double
add cman_tool actions status|nodes|services to display document new actions wait, status, nodes, services send all debug logging over local socket so fence_tool can monitor 'fence_tool monitor' will print debug logging from fenced improve an error message include time in debug output Fix dlm_astd hang in bz 145090. Have dlm_astd skip lkb's in the ast Change all log_all() to either log_debug() or log_error() since improve chances that withdraw won't get stuck by: return an error in process_join_stop instead of asserting ignore any wakeup that arrives after the serviced thread exits There are a couple of potential problems this should fix related to cman_tool now reports an error if the nodename doesn't match exactly undo last change (enforce matching nodename/cluster.conf) so we can need copytobin If recovery was aborted during restbl_rsb_update(), some rsb's could be - Assert that the recover_list is empty at the start of dlm_dir_clear. remove utils_srt.c In rcom_send_message, return an error from midcomms_send_message remove three log_debug lines that are called so frequently during - liblvm2clusterlock.so has moved from /lib to /usr/lib so we need: - exit with an error if an invalid votes value is found Similar bug to the one fixed the other day. If recovery is aborted remove unused "bulk lookup" function Clean up changes from last commit related to bz 143487. cman_tool now picks as the local nodename whatever name has been entered don't do a dlm_lock_dump when an assertion fails When a conversion request has been sent to a remote master, that We add NL locks to a resource to implement gfs's hold_lvb(). 
When remove multihome setup from man page document -w and -q options for join Add option to fence_tool to wait for the node to complete its join and don't complain when we see fence_tool's -w option remove stray line The current manpages are written in plain nroff which is not parsable document the wait (-w) option list fence_tool Blocking asts were being ignored for all locks being converted which We were ignoring blocking callbacks for locks being converted The current manpages are written in plain nroff which is not parsable Make dlm_recoverd thread a permanent fixure of each lockspace. remove some non-critical printk's ignore fence_tool's Q option "The attached patch makes it possible to ask fence_tool to not wait for Recognize and resolve a second form of conversion deadlock. When When locks on the convert queue are granted, we need to try again to grant Remove unfencing since it needs to be reworked and won't be ready checkin removing unfencing wasn't complete, sorry Add a bunch of schedule() calls to potentially-long loops in remove "lkb xxxx exists" message which can flood the console/log. Ignore any NEWLOCKS or NEWLOCKIDS messages from a previous instance If node isn't found in sm_members report an error and don't oops. New dlm development that won't be functional with gfs for a while. Program for managing dlm membership in dlm-kernel/src2. #define misc name device node stuff for ioctl, largely copied from dm standard copyright matching dlm_nodeid_addr/dlm_addr_nodeid fix small things to get working Fix problems with cancelation. We weren't dealing with waiting locks When gfs does an lm_cancel() we need to do a dlm cancel if that's misc small bits, handle >1 local addr. copy cancel fix from src/ When shutting down a lockspace we can remove lkb's from the ast_queue copy ast_queue fix from src files make the weight arg optional Use sysfs for lockspace control. dlm-member ioctl's now only used for Use sysfs for lockspace control. 
And set_id is a new action command line interface for libdlm operations create/release/lock/unlock complete lock code split some functions into another file so they can be shared fix output in /sys/kernel/dlm/<name>/members cman doesn't provide proc entries remove usage of proc - use the new timeout variety of wait_event more addr length fixes need the sockaddr struct, not a pointer increase DLM_ADDR_LEN to 256 bytes -- must be at least as large moving lock dumps to debugfs, copying previous proc.c and ipoib_fs.c making nodeid an int consistently hooks things up to debugfs deal with some errors better a simple daemon to listen/talk to dlm in the kernel, connecting it dlm_astd in wait_event_interruptible wasn't being woken by fix release and convert actions; use persistent flag on locks use ls_debug_list for debugfs change sleep(1) to sleep(5) between a failed fence and a retry version 2 of the central locking logic, eventually to replace more work more work, largely refcounting related On Fri, Mar 25, 2005 at 03:22:38PM -0800, Daniel McNeil wrote: another small bit for rsb refcounts include event_nr in done message to groupd couple fixes use info from cman to do set_local/set_node calls into the dlm couple fixes error message add error message simplify dlm_astd handling by removing the wait_queue; and avoids Add version number to the start of all dlm messages to help with an example of using dlm_tool backup recent work back up work backup recent work some fixes from testing add files changes to existing code corresponding to new lock.c remove files add file don't need cluster link copy other hash function from gfs which works as well, but fix some bugs bug fixes better way of returning some errors, more recovery bits bunch of fixes several fixes couple fixes and thread to free rsbs change to per-ls the list of lkb's waiting for reply, split a couple new rcom routines that use normal lowcomms get/commit_buffer outline for last recovery stage that replaces rebuild.c 
same schedules that were added to the RHEL4 branch so serviced doesn't refcount fixes, now requires post 2.6.11 version of kref_put comment out noisy logging misc minor updates changes for debugfs complete more of the bits that are replacing rebuild.c misc recover fixes update lvb recovery function clear up confusing names quit waiting for fenced to join (-w) after 10 seconds if fenced next version of lock_dlm, a bunch of stuff is moved to userspace a couple fixes don't do useful work in an assert macro fix bug in recover_master_copy add missing wake_up reject invalid event_nr's on start cancel waiting locks; remove last lkb ref in both revert/remove_lock simplify the ioctl bits now that they're only used for setting node ioctl changes dlm_node.h replaces dlm_member.h dlm_member gone most of the byteswapping can now use openais use correct libcman byte swapping - make functions static update copyrights preparing to build from linux/drivers/dlm/ process group manager include groupd.h wasn't closing fd's for sysfs files lock_dlm daemon manages mount-group membership in userspace userspace version of kernel's list.h misc fixes don't confuse lock_dlm's uevents as ours recover_done, not done is sysfs file name unmount bits build stuff update make patches patches used by 'make patches' to generate complete dlm.patch remove print formats we're not using queries are waiting until after the first round a couple updates lock_dlm now exits, and pool doesn't Improve logic that delays and reduces fencing. 
When fenced is recovering use prefered inline pack_rcom_lock/receive_rcom_lock_args was missing lkb_status use long to save pointer val Move the __lvb_operations table into lvb_table.h so it can be remove sbf flag - remove hierarchical sections that were commented out so it can all seems EXPORT_SYMBTAB isn't used any more remove query bits that were commented out of device code update description of dlm.h flags get rid of DLM_RELEASE_NAME remove some complexity Fix one of the big fixme's: immediately after finding the lkb in Pass nodeid/addr info directly from node_ioctl.c to lowcomms where - grant_after_purge only on master rsb's free rsb's on toss list during recovery so they don't need to be return an error if no local addr's are set In-progress down-conversions should just be completed at the start reduce debug noise Initial untested code for recovering conversions between PR and CW. adjust more fixes move some recovery-related functions from lock.c to recover.c read_lock will do when creating root_list update FIXME comment "But, in any case you might as well move the label 'top' inside use msleep and ssleep instead of schedule_timeout clean up some spots where we don't need error handling, complete the - get rid of parentheses around defined values some misc tidying, use printk log levels newlines misc other formatting and tidying from reviews depends on INET, select IP_SCTP was hoping to avoid this, but search on every lkid we create to make scan/toss_secs dlm_config values, get rid of empty lockspace_exit put the '*' just to the left of the structure field also return -EBUSY for convert if lkb_wait_type is non-zero don't wrap wait_event in an infinite loop and use a timer - dynamically adjust the delay when polling a node for its status, go back to schedule_timeout() instead of msleep() in dlm_scand, - set lkid in user's lksb earlier so they don't see 0 - check if lvb alloc failed spell out DLM at the top - need to lock_rsb/unlock_rsb in recover_locks() 
because ast_queue_lock can be a spinlock patch from Steve Dake to avoid using clm library need del_timer_sync() with the new timer-based dlm_wait_function() improve debug logging skip barriers for now - changes to debug messages make the waitqueue usage more standard style stuff, mainly making lines under 80 do a schedule/retry when kernel_sendmsg() returns -EAGAIN, kernel apps must specify the lvb size they'd like (multiple of 8 bytes) specify lvblen when creating lockspace, 32 bytes byte swapping wrong size in rcom struct fix a couple remaining over-80 lines tidy a few things improve comments and remove log_print about partial messages status replies include a new "rcom_config" struct that is used move is_master() to lock.h and use it in recover.c NEW_MASTER flag wasn't being cleared on rsb's remastered locally ignore start events if lockspace is running to avoid an assert failure - big reordering of functions in lock.c to avoid declaring some Tiny code change to implement significant optimization: if a node make purge_queue() more general purpose, no functional change send a status request to every member at the start of recovery just adjust size of write, remove prints use cman barriers again spinlock fix from patrick Wait until all locks are recovered before doing lvb recovery. dlm_recover_members_wait() and dlm_recover_directory_wait() moved Move recover_rsbs() to the main recovery section, requires a new remove the timeout in wait_function which was for debugging, - get all lines under 80 fix up some comments - remove a couple FIXME questions remove trailing white space end files with
change debug message don't want .orig from patch no parens around defined values daemonize lib interface for groupd don't add connection if accept fails add _GPL to EXPORT_SYMBOL remove unused function change read_lock to read_unlock add to printk's don't add duplicate local addresses add Makefile use libgroup really use libgroup fix sprintf's add daemon and dlm_tool add group_leave arg missed new file add missing make_args new version using libgroup and libcman don't build fence_tool, not updated yet clear old recovery flags (LOCKS_VALID) when recovery begins better debug output munging changing things so group_join/group_leave are initiated by join/leave now just send messages to fenced which must already be add comment build fence_tool return 0 on success from group_join/leave various fixes completing switch to libgroup - remove the group struct when we're done leaving the call to process_recover_msg() had been commented out during re-enable byte-swapping of messages fix leave initial bits to report group listing put back flags to ignore certain messages, other minor stuff replace with new version of lock_dlm in devel/ remove devel files some byte-swapping calls were still commented out Implement join/leave info: a small string of app-specific data that an make spectator and withdraw options visible through sysfs implement group_get_group() to get info for single group munge group data returned in query recognize get_group request in daemon, use memcpy for info data, align text fix uninitialized pointer in daemon, simplify code in the mount path complete some missing bits; verify cluster name at mount, verify zero padding for id avoid double free of rsb's lvb when clearing lockspace Dynamic journal ids, done here using a simple message through get libdlm.h from the correct place within the tree don't build deprecated cman and sm dirs remove cman-kernel usage build src2 instead of src don't build cman-kernel add makefile build in lock_dlm install groupd and 
group_tool clean and install in group dir list new daemons that need to be started not doing patches Need to check the cn_member field to tell if a node is a member, use libcman to check if victim has rejoined copy all the data from cman_get_nodes memcpy all the data after cman_get_nodes dlm needs 2.6.12, don't build until that's out fix-dlm-without-debug.patch fix-dlm-extern-lvb_table.patch tidying, do ifdefs around debugfs functions in a consistent way restart events after a delay install daemon Set ls_first for gfs when a spectator is first to mount so gfs changes to debugging output, don't try another leave if one is a lot of fixes, use messages instead of barriers from libcman option to list only one group don't depend on leave state here, always go to groupd tidy error handling export dlm_lvb_operations symbol so dlm_device module can use it changes to debug logging only print debug info to stderr when -D is used bits for withdraw new file remove temp debug bits, add 'dump' option to get debug log from groupd remove noisy debug line only print debug lines to stderr when -D is used use correct list field in group struct kobject was being freed too early in withdraw Work around gcc-2.95.x macro expansion bug (from akpm) munging to match upstream correction of export symbol lvb_operations remove repeated include of module.h header - include everything that needs sending in the standard message struct close fd's of dead clients start group id's at 1 instead of 0 look through correct list for failed nodes needing recovery fix calculation of previous low nodeid recovery timer can't be global, it must be per-lockspace don't need to include lvb_table.h correctly interpret the return value of do_barrier when freeing locks for withdraw, don't try to free fake lvb's All lookups outstanding when recovery happens need to be resent after When an outstanding lookup is re-processed after recovery, the MASTER_WAIT Add two comments in set_master() explaining how things work. 
adjust some log_error() messages and open syslog when daemonized look through correct list (members_gone, not members) for recovering This patch makes needlessly global code static. Use node weights in directory node mapping. A node is responsible default weight is 1 not 0 Use ccs to get a node's optional weight value. Replace test_bit(), set_bit(), clear_bit() of rsb flags with Big rework of lockspace control/management. Simplifies things different sysfs hooks for controlling the dlm New way of controlling the dlm, no longer mirrors groupd callbacks. Resolve potential recovery problems in dealing with an rsb prior to dlm builds on 2.6.12 get to compile on 2.6.12 file COPYING was initially added on branch STABLE. file VERSION was initially added on branch STABLE. Per-lockspace option for dlm to run without using a resource directory. need to copy the rsb's hash into the remove message If recover_locks() on an rsb doesn't find any locks to recover, file INSTALL was initially added on branch STABLE. incorrect logic in telling kernel when join/leave was complete Need to release the list of root rsb's when recovery is aborted early. When a lockspace on a remote node is not found for a recovery fix Makefiles for install Add new phase to wait for application acks following a stop callback. 
ack stop callbacks add "first_done" sysfs file for dlm_controld to be notified of When we're the first mounter, wait for "first_done" (set by Last change was not correct, it's a second mount we need to delay -n used with gnbd commands use a constant message size between libgroup and groupd add global "joining" var needs to be per lockspace - simplify a bunch of old junk wait to add second mounter until gfs's initial recovery is done on Significant cleanup, lots of style stuff kernel people like function type and name on same line function type and name on same line, use list_for_each_entry tidying and filling out comments fix mistake from converting to list_for_each Carry out Ken's instructions from gfs2_ondisk.h to change cleanup and tidying function type and name on same line is preferred style remove parens from defined values, fix lines over 80 remove more define parens Remove ENTER/RETURN macros used for profiling and tracing debugging. depends on SYSFS munging to fix lines over 80 and other odd line breaks use list_for_each_entry don't build unused debug header use kthread functions Only allow one of the lock_dlm threads to do blocking callbacks. use msleep() instead of schedule_timeout() replace __inline__ with inline remove/change comments that are out of date or unnecessary fix typo bug from list macro conversion remove define parens remove unused options reorder/indent for readability add 2005 to copyright typo bug "&atomic_read()", remove & get lm_interface.h from ../harness get lm_interface.h from . 
quotes around lm_interface include used to generate kernel patch where we need <linux/dlm.h> adds gfs2 to kernel build generate new kernel patches include "locking/harness/lm_interface.h" To avoid compile warning, replace get_v2ip(aspace) = NULL with The max num_glockd was lowered from 32 to 16 when we switched to using depend stuff include gfs2.txt in kernel patch add file for kernel patch replace spaces with tabs remove trailing whitespace In conversation, simply refer to "GFS", not "GFS2". Add a comment with Ken's explanation of the diaper device. more tidying, removing old or unnecessary comments inline instead of __inline__ Go back to schedule_timeout() in the daemons. We want to wake up more add static to acl_get() adding static to a bunch of stuff found by sparse, and a __user __user annotations for proc functions __user annotation in gfs2_readlink() Excerpt from Ken Preslan's "ramblings". spaces to tab drop unnecessary casts of void pointers gfs2_random() is not used, remove it. replace gfs2_sort() with sort() from linux/sort.h for gfs2_disk_hash() use the kernel's crc32_le() instead of our own gfs2_disk_hash.h contains gfs2_disk_hash() and crc table that's Remove all memory debugging per lkml comments; a pity, this stuff was callers of inode_create() can deal with error, don't need RETRY_MALLOC remove { } creating code block within function due to complaint Get rid of RETRY_MALLOC entirely, although the one place it couldn't replace the hash functions with linux/jhash.h style munging: Use wait_event() in do_lock_wait() instead of managing the waitqueue get rid of fixed_div64.h -- the existing do_div() works fine in my - remove empty kerneldoc headers remove another { } block Remove context dependent path names. Requested by Al Viro. 
comment to link flags to struct member convert lops.h macros into static inline functions convert vma2state from macro to static inline function asm headers after linux use kzalloc instead of kmalloc+memset replace kmalloc/kzalloc with kcalloc where appropriate replace 256 with defined MAX_LINE fix whitespace damage where lines start with 1-7 spaces followed by a tab c99 initializers replace 0x7FFFFFFFull with MAX_NON_LFS in size checks Remove diaper code, add a FIXME in the place where we need to gfs2_ip2v(ip, NO_CREATE) -> gfs2_ip2v_lookup(ip) remove functions unused now that diaper is gone cool down on the copyright artwork remove comment I added copyright artwork remove empty kerneldoc headers depends on IPV6 Configure lockspace id and members through configfs instead of sysfs. Configure lockspace id and members through configfs instead of sysfs. don't compile dlm_tool, configfs changes mean it won't be working Configure node addresses through configfs instead of ioctls. Move setting of the lockspace id from configfs back to sysfs where set node weights (now through configfs) - when weight isn't set, it should default to 1 support more than 1 address per node, up to DLM_MAX_ADDR_COUNT (3) fix problem with the previous change for multiple addresses rmdir, not unlink, to remove node from lockspace functions no longer exist dlm_our_nodeid now from config.h munge whitespace to match upstream so patches are sane remove temp defn of kzalloc use new schedule_timeout_interruptible update member_sysfs.c has been decimated and is no longer related to use linux/jhash.h instead of our own select configfs Add printk log levels by replacing printk with one of: hold configfs subsys lock while accessing the children list remove PRI/SCN defines that aren't used comment out CMAN_CMD_ADD_KEYFILE which isn't in header yet memset node struct to 0 before cman_get_node clear node struct before every cman_get_node follow_link now returns void* These will replace proc.[ch] and do the 
same thing through sysfs. fix compiler warning about ignoring return value of inode_setattr Replace the procfs hooks for list/freeze/withdraw with sysfs hooks. use sysfs instead of procfs for list/freeze/withdraw do nothing in withdraw (instead of BUG) until the dm hooks are ready get to compile on 2.6.13 Pass plock requests to user space lock_dlmd through reads on misc device, Get plock requests from kernel through misc device. Allow "id" for the lockspace/fs to be set through sysfs; use macro to define sysfs attributes revert accidental change update kernel patch generating target update for new plock header update for use on -mm kernels remove old patches replace gfs's switchable endian conversion functions with plain remove oopses_ok option since a panic results regardless, no difference no more oopses_ok option no more oopses_ok don't need to include smp_lock.h patch from Mike Christie replacing PRI defines slim down the printk's in assertions Replace TRUE/FALSE with 1/0. This is a really unfortunate blow to Pass all gfp flags to gfs2_holder_get() instead of having GFP_KERNEL Don't reset atomic statistics counters to zero when they roll into get rid of glock_hold/glock_put, use gfs2_glock_hold/put everywhere more munging of gfs2_assert more gfs2_assert munging more trivia, replace define with enum replace more defines with enums tidy recurse_check remove "get_cookie" ioctl, not used any more tidy line breaks replace PRIx64 with llx apply same fixes here from gfs comments: enums, better assert fill in the standard stuff to send plock_get to user space define GFS2_FSNAME_LEN to use in place of 256 misc style stuff we dumped the gfs endian conversions and use le everywhere put back previous insmod printk remove extra tab remove bio counters that were kept by the diaper device minor fixups move a bunch of stuff from ioctl to sysfs Pass the kobject for the fs into lm_mount() so the lock module's The fs kobject is now passed into the lock module at mount time. 
add a couple basic sysfs attributes missed the sys_fs_uninit at unmount add
update to work with new sysfs organization handle sysfs dirs for both gfs1 and gfs2 move shrink and statfs_sync from ioctl to sysfs Use kref structs and operations to do glock reference counting instead Add comment explaining how queue_empty() is used. update FIXME comment don't include the kernel's lm_interface.h, need smp_lock.h for lock_kernel() define LM_OUT_ERROR for a lock module to return When the dlm returns an error, return LM_OUT_ERROR to gfs instead if the lock module returns LM_OUT_ERROR, withdraw from the cluster missing include withdraw if the lock module returns LM_OUT_ERROR add a static remove some log_debug's refresh for latest -mm use list_for_each use list macro file WHATS_NEW was initially added on branch STABLE. Showing extended statfs info in sys set a couple people off, Add a couple comments. adds list_for_each_entry_safe_reverse to list.h Use list_for_each_entry_safe_reverse. last checkin for list_for_each_entry_safe_reverse was incomplete use ALIGN instead of MAKE_MULT8 Port plock management from lock_dlm kernel module to lock_dlm userland declares fs_subsys for /sys/fs/ register gfs under /sys/fs instead of /sys/kernel. base dir in sysfs is now /sys/fs/gfs2 /sys/fs instead of /sys/kernel comment out bits that used "statfs" entry in sysfs that's now get block size from sb instead of defunct statfs move quota sync and refresh to sysfs Add comment with Ken's quota summary. The start of a mount.gfs2 program that is called by mount(8). This untested interaction with lock_dlmd added default to 1 vote if no value given in cluster.conf Remove "do mount" uevent message to user space and waiting for "mounted" Accept connections for mount requests from mount.gfs. Do mount Pass the correct mount options. 
need EXPORT_SYMBOL_GPL umount helper On unmount just leave the lockspace and exit, user space umount.gfs split depends line Write our own reduced option parsing routine and don't bother using build mount.gfs2 and umount.gfs2 do mount option munging on a copy of gfs's hostdata buffer not sure why these are still here don't use patches any more Remove some unused functions, make others static. dlm_find_lockspace_name is now the static one Every file should #include the headers containing the prototypes for Share common code between mount/umount; both about finished. add fixme comment look in /proc/mounts instead of /etc/mtab for gfs mounts Code to add/del gfs entries from /etc/mtab; junk I was hoping to extraneous
mount.gfs/umount.gfs are mount.gfs2/umount.gfs2 debugging output and options don't restrict to gfs2 deal with gfs1/gfs2 differences better remove option debugging When using lock_dlm, all gfs2 unmounts would panic in Check in Kevin Anderson's mount sync patch: Check in Wendy Cheng's fix: typo: not &'ing DLM_LKF_PERSISTENT with the flags Check if allocate_lockinfo() fails. Munging and renaming related to the merging of lock_harness into Incorporate the lock_harness into gfs itself, just as was done for gfs2. printk prefix GFS instead of GFS2 in merged harness code make lock modules use modified harness interface and header from gfs Copy the get/set vfs<->internal casting macros from stable branch to update gfs2 description New groupd that uses the cpg (closed process group) service from openais. Remove the "info" params to join/leave and other related functions. Removed group_join/leave arg. Remove group_join/leave arg. Remove group_join/leave arg. Need to find a new method for detecting add nodeid to message deliver callback Put this code back into the correct state. (This is not supposed to include headers from ../../cman/daemon/openais/trunk/include misc fixes getting it to work handful of fixes rework things so we should be closer to passing message delivery misc fixes more progress, can now process a join bits for sending messages to the group code for passing through messages, setting global id's, and remove unused file Munging to get this compiling again; need to define new annotated clean out a few unused bits Send messages (used for distributing journal id's and plocks) Remove some old state stuff that doesn't exist any more Comment-out withdraw stuff which depends on libdlm which doesn't match debug messages with the same ones in other daemons Fix leave processing, leaving node can't wait for stopped messages Fix the check for us being in the fence domain, libgroup update for new components Fixing leaves. 
don't need to include dlm kernel headers use "gfs" instead of "lock_dlmd" as the group type printed by group_tool munge header for group listing - also build libdlm note to help clean up in case umount(8) doesn't call umount.gfs2 add debugging to determine if the mount syscall is stuck - When we leave a lockspace, remove the configfs entries for it. Only send jid's to new mounters, not everyone. don't need to zero out end of addresses any more Uncomment withdraw stuff that uses libdlm. when a cpg connection is closed, also clear out the poll client return an error from group_dispatch() if the read doesn't return check for errors returned from group_dispatch() missed a line which broke the compile Add event state to the set of group info we provide, and have set the width of the name field dynamically so state info won't Add -w option to disable the withdraw feature. More descriptive debug messages. - Basic recovery works. - command string needs to be "setid", "set_id" wasn't recognized check that the mount point exists and is a directory Fix how we dispatch cpg callbacks; dispatch one per poll() event. api changed to unsigned id - Add extended event information that group_tool -v will print - add version numbers to messages Because cpg leaves are processed asynchronously, we can't use the Purge messages that get queued after we're added to the cpg group - don't process new non-recovery events while recoveries are Enable withdraw functions by default since libdlm locking is now Rename the daemon binary from "lock_dlmd" to "gfs_controld". changing lock_dlmd to gfs_controld in various places lock_dlmd now called gfs_controld Add debug output showing the changes we get from cman. When nodes are removed from the cluster, remove their dir in sort listed groups by level - If, when mounting, we receive the nodeid/jid message before processing add to error message Respond to cman's TRY_SHUTDOWN callback. Reply to cman's TRY_SHUTDOWN callback. 
Daemons now exit when cman says the cluster is down. don't let the gfs_controld lockspace prevent a cman leave always reply yes to a cman shutdown, letting the daemons themselves Exit if we get POLLHUP on cman fd. Exit if we get POLLHUP from cman. When starting up, also clear any old configfs dirs out of When we exit because the cluster has shutdown, do a force release If the user specifies hostdata options on the command line, they need mount point for configfs is /sys/kernel/config, not /config mount configfs at /sys/kernel/config, not /config This is a significant rewrite and expansion of the code that Add remount support which involves updating our mount mode valid jid is >=0 not >0 A start_done() was missing in the case where the last rw Add some more tips, and an example cluster.conf If the last rw mounter umounts leaving only ro nodes, the remaining byte-swap id's in inter-node messages Remove the restrictions on when readonly mounts are allowed. As long This should help fix a possible hang caused by a node failing at just Fix hang if unmount overlaps recovery slightly differently, i.e. if Look for gfs_ondisk.h in ../../gfs-kernel/src/gfs/ instead of now need to build cman before ccs improve some debug messages debug statements Major changes in handling recovery for failed nodes. 
Previously, munge debug messages recovery result values for success/gaveup reversed When a recovery event arises while processing another recovery event, When we get the extraneous recovery_done for our own journal when If a bug in a client results in them doing a startdone at the We can get more than one node reporting success in recovering when multiple nodes failed together (in one event) we were only in withdraw the group name wasn't being parsed from the table name %ll print format for handles looks like we need to build: it's not right to use "init" to determine when to save an options This should fix the trouble caused by nodes rejoining the cluster fix logic that reduces debug output An unused recovery set needs to be cleared in either cpg or cman build all order: cman/lib, ccs, cman make some names more descriptive of what they do Untested fix for the case where nodeA fails, nodeB fails while When mounting a fs, we first join the mountgroup, then tell mount.gfs If there are no devices defined within a node's method, that method Fix some possible problems with overlapping recoveries, and cases When more than one node failed at once, creating an extended recovery set stopped flag for rev->nodeid, not ev->nodeid - id's were being removed from a recovery event's extended list Comment out code that waits for kernel mount, it's not working Handle case where none of the current members of the mountgroup When a node is joining a group, has been added to the cpg, but print out the buffer for debugging when we can't parse it when sending the results of local journal recoveries, we weren't Changes to the way recovery events are handled when we're already in decrement the node count before printing debug info so we see the only print debugging info to stderr if -D was used -- this may have retry cpg_join() when it returns ERR_TRY_AGAIN check that the specified mount type matches the actual fs type on disk - re-enable the code that waits for the kernel mount in the 
finish() when unrecovered nodes are set to be recovered again after sequential when finishing an unmount for a node, we shouldn't be checking it's Set up a separate cpg for sending messages (e.g. for processing - get kernel types like __be64 and __u64 (used in gfs2_ondisk.h) by - get NETLINK_KOBJECT_UEVENT definition from kernel's netlink.h there is no dlm_tool any more move dlm/daemon/ to group/dlm_controld/ remove daemon/ files, now in group/dlm_controld remove lock_dlm/ files, moved to group/gfs_controld/ run configure in group/ don't syslog non-errors update some of the build steps change log_error to log_debug for non-error for group_tool query, fill in members list from app perspective, Significant reworking of how mounts are processed. The previous Complete the code to support withdraw, not yet tested. This also remaining withdraw bits now in recover.c Moving the cluster infrastructure to userland introduced a new problem - keep cman member list updated by using cman callbacks instead don't skip fencing a node unless it's both a cman member and has - sort out which messages should be log_debug/log_group vs openlog("groupd", LOG_PID, LOG_DAEMON) for syslog entries Don't finalize/terminate a local group leave until we see that all need to include ../make/defines.mk to get {sbindir} definition add standard script improvements to debug messages now that we copy out app member list for viewing, set the remove debug printf don't process new join/leave events without quorum retry cpg_join and cpg_leave if error is TRY_AGAIN posix_test_lock() args updated for 2.6.17 Fix install fix makefile fix compiler warnings - extra checking and debugging when events get backlogged gfs_controld_connect error values are < 0, not 0 - build against installed openais/cman headers and libs to be consistent, <libcman.h> instead of "libcman.h" build against installed headers and libs for cman and openais build against installed cman lib and header steps to download/build/install 
openais and libvolume_id tarballs put back old check that previous commit avoided complain and ignore a cpg confchg reason we don't understand dispatch_fence_agent() was prototyped and called with an extra arg set DESTDIR when installing openais no more dlm_device module fix up group_tool dump which was broken fix dump len so we don't complain debug print of the full uevent string from the kernel - memset to 0 arrays of arg pointers add libgfs remove duplicate line keep a 1MB circular buffer of debug messages, they can be dumped out add option to dump debug messages from gfs_controld using node A may get a start cb and send a started message, and node B may fix up debug logging Use system includes instead of including from configured kernel_src. do distclean in group/ do distclean set cmanlibdir for group some trailing )'s were left out needed _safe version of list_for_each_entry when moving entries uncomment bullpap and ipmilan use cmanincdir when building gnbd when a kernel mount fails and we leave the mountgroup, we need to if mount.gfs is unmounting/leaving the group because the kernel mount have gfs2/Makefile install/uninstall mount and umount binaries itself remove duplicate from a couple log_debug/log_error From: fabbione@ubuntu.com keep 1MB circular buffer of debug messages that can be sent to a 'group_tool dump fence' will dump fenced's debug buffer Update the cman member list every time we call is_member(). 
When - use nodeid and owner when checking the owner of a plock instead of 'group_tool dump plocks <fsname>' can now be used to display all - checkpoint usage for plocks is getting closer, basic writing/reading do byte-swapping before freeing a group struct, sanity check it's not referenced in - complain and ignore checkpoint sections with a bad size bring lm_interface.h in sync with the version in gfs2 Some basic stuff that I hadn't realized I'd not done back when update lm_interface.h from version in git tree free all plock state for an fs when it's unmounted use the correct (global) handle when unlinking a checkpoint if a node has a saved ckpt when it unmounts, it needs to unlink it The idea to have the last node that did the checkpoint try to reuse it don't send plock debugging to stdout with -D, use -P to get that now log_debug() when we receive a withdraw message report mount failure debug message earlier There's been a relatively unusual problem explained in the comments that show all options in help output remove a couple log_error's Code that starts groups in order of level during recovery wasn't daemons that depend on groupd (fenced, dlm_controld, gfs_controld) don't barf on unused args don't barf on extra args errors opening sysfs files are normal/expected in many cases, so change log_plock() to log_group() for packing/unpacking plocks in don't barf on unknown option arg after unlinking a ckpt, don't try to close it if we don't have it open, change debug messages related to storing/retrieving plocks to/from when the low nodeid fails, the checkpoint needs to be unlinked, - the check for us becoming the new low nodeid after the previous one expand the number of cases where we don't tell gfs-kernel to do recovery When we're in X_BEGIN state, accept "stopped" messages from other - break from snprintf loop when buffer is filled when we set a recovery event back to the FAIL_BEGIN state, make tidy up a couple style things When deciding whether we need to unlink 
the checkpoint and resend journals convert write(2) calls to use do_write() which handles EINTR and no void arg in dlm_get_fd prototype was causing warnings handle short or interrupted reads/writes, an snprintf instead of sprintf, - minor change to the delay we add between each cpg_mcast retry use same retry delay on cpg sends as gfs_controld, usleep(1000) remove stuff from dlm/nolock/harness since it all comes from upstream undo junk mistakenly added by last commit Use the event_nr arg provided in start_done to check if the start_done update per the gfs2 upstream changes to the lock module interface: handle short/interrupted writes/reads Fixes a really stupid bug checked in yesterday that causes groupd have groupd set the scheduler to RR priority 2, same as gfs_controld positive return code from recover_current_event() should just indicate Get lm_interface.h from the kernel instead of keeping a Add debugging in four areas to help us know more quickly when something Adding -vv to the groupd command line will result in a log_debug put a message in syslog if we get a cpg error that we can't deal with set the "member" field in the group_data struct that's returned updates gfs-kernel (gfs1) and gnbd-kernel are going to track the RHEL5 kernel update don't configure gfs-kernel or gnbd-kernel now that they're not built make the number of clients a global variable so it will be easier - check cpg flow control status from openais when processing plocks This is a big batch of code that gets us further along the path to add -p option to completely disable plocks/ckpts if we get a plock request from the kernel when plocks are disabled, Handle the case where we're the second node being added to the group If cpg_join or cpg_leave are stuck in a retry loop, put an error replace spaces with tabs Handling a lot of hard situations in the areas of: The corresponding changes to the gfs_controld changes in handling Fix an effect of recovery mixed with joins where the node whose join A 
node that was just added would incorrectly conclude that the node fix style badness fix typo in debug message typo, deleting "rs" instead of "re" when cleaning stuff up Recent changes to mount scenarios (mounts while another node is doing we weren't cleaning everything up for a client upon POLLUP Patch from Abhi to fix case where a node's mount is rejected by other clean up gross code Clear out configfs dirs that we've created before exiting. recent commit fixing bz 210344 removed the memset so we're When a new master joins the mountgroup, it retrieves plocks from Add plock rate limit option -l <limit>. Current default is no limit (0). Default plock rate limit of 10 instead of 0. uncomment scheduler settings fix sched_priority from sdake if read() returns a non-EINTR error then shut down the client if read() returns a non-EINTR error then abort The plock rate limiting code should use the full timeval to measure fix a couple of problems if openais enables flow control: use timersub() macro to subtract timevals instead of coding it handle errors or short reads when reading /dev/misc/lock_dlm_plock if mount fails, don't try to save the mg info for the new group the fix yesterday to prevent a segfault when mount failed mistakenly From: Steven Dake group_tool dump doesn't handle partial reads/writes, Be more intelligent about handling recovery sets so we can deal with change the default plock rate limit from 10 to 100 Before doing the mount-group portion of withdraw, fork off a dmsetup to Pass gfs_controld the device being mounted, it'll use this if it When lockfs is called from the vfs (due to a dm suspend), don't try Call into the lock module to do a withdraw instead of just calling BUG. 
very useful testing program I wrote a long time ago tidy up some prints add lock_flood/unlock_flood/unlock_flood-exit commands to test doing groupd's function that returns info for group status queries was Switch from CMAN_DISPATCH_ONE loop to CMAN_DISPATCH_ALL to resolve Switch from CMAN_DISPATCH_ONE loop to CMAN_DISPATCH_ALL to resolve revert last checkin When the first mounter is recovering all the journals, it should use Fixes related to the needs_recovery state and first-mounter recovery. Support mounting a single fs on multiple mount points. mount/umount modifications of /etc/mtab weren't smart enough added "flood n mode" function a while ago, doesn't equate to pjc's groupd creates uint32 global id's for each group. It doesn't add -K option to enable dlm kernel log_debug's Move memset(0) into the for loop so we're clearing the data buffer test program like gfs's 'alternate' but using an lvb instead of a file. join lockspace, optionally sleep, leave the lockspace. If the only two groups were two dlm lockspaces, then during recovery, clear configfs stuff if we get SIGTERM, this is a convenience if you remove the self parameter (sync up with RHEL5 branch) lots of changes, biggest is new "stress" test updates un-comment-out gfs-kernel and gnbd-kernel since they now build on latest version, "stress" test running correctly Look for a protocol setting in cluster.conf dlm section, and set Use realpath(3) to canonicalize path names for device and mount point. Check right away if the kernel has gfs/gfs2 support by looking in change some mount error conditions to log_error() instead of log_debug() various changes Look in cluster.conf dlm section for protocol, timewarn, and log_debug Make new features available based on recent dlm kernel patches. 
add dlm_tool, can be used to join/leave lockspace use dlm/Makefile to build lib and tool dirs don't do gfs_sb_print() if we don't detect a gfs fs, it often just Add dlm_ls_deadlock_cancel() that allows a system daemon to cancel bunch of stuff to test new features report an error if no lockspace name is provided return a different error number to mount.gfs for each specific failure translate different error numbers from gfs_controld into specific, (copy from RHEL5 branch) Return 1 or 0 GETLK result to the kernel for conflict/no-conflict. log an error message if we see mount.gfs killed before it's done Block SIGINT (^C) around the three steps of mount: s/unsigned long/unsigned long long/ - add more specific warnings/errors when connecting to gfs_controld fails Various small changes and additions. Munging formatting to avoid add a bunch of casts to quiet warnings on x86-64 Make gfs-kernel compile against post-2.6.22 (2.6.23-rc) kernels. add lockdump and option to set permission of dlm device when creating recent cleanup of warnings should have specified unsigned in long long Brute-force porting to 2.6.23-rc1. There are non-trivial changes for add new code to find and resolve deadlocks, still incomplete, disabled dlm_tool deadlock_check <name> is a way to manually kick off a deadlock minor updates for cluster-2.01.00 sdake says that DESTDIR=/ is correct, not /usr Remove check_sys_fs() since it breaks on-demand fs module loading from the fill in a couple more bits related to canceling the chosen lock don't add the same transaction to a waitfor list more than once Detection and resolution now works with my basic deadlock tests. 
put back the ability to do pid-based deadlock detection on 5.1 kernels clean out junk that was only relevant to rhel4 clean out some options that were only relevant to rhel4 Update fence, fenced, fence_tool and fence_node man pages which were add man pages mention fencing override, describe the structure of node fencing add man page minor updates Outline the basic ideas of multiple methods and multiple devices. Vastly simplify this man page. Include no cman or fencing information install dlm_tool.8 add makefile install in man dir handle addition/removal/failure of nodes during a deadlock cycle Call the new cman_set_dirty() api to disallow clusters both with update ccs man pages the NODIR new_lockspace flag was always being used, even if the -d the -m mode option was being ignored and 0600 always used proper help output for -m option comment out the new cman_set_dirty() call; it's not working mention group_tool should be used instead of cman_tool services use an admin handle from cman to call set_dirty add new test for deadlocks fix attribute xml format for cluster_id and keyfile rewording and embellishing some bits related to openais.conf needed to be a little more thorough in taking a canceled transaction dstress fixes I think I added this years ago, forget why Reject mount attempts on an fs that's still in the process of unmounting. report that a mount fails due to an in-progress unmount go back to a default of -O0 instead of -O2 to get the stuff with -Werror forgot the 0 after the -O Do nodedown events when the confchg for the groupd cpg arrives, go back to default of -O2 now that -Werror problems are fixed The output of 'dlm_tool lockdump' could make it appear that a granted used wrong define, DLM_LOCK_ instead of LKM_ Improve the dumping of debug logs from daemons. don't setup netlink if deadlock is disabled xid needs to be unsigned long long ASSERT was doing fprintf(stderr) which goes somewhere we don't want A performance optimization for plocks. 
This speeds up locks that are Testing revealed a couple more races I hadn't expected. change some log messages new plock ownership related stuff fix %llx printf warnings using (unsigned long long) odds and ends not committed bz 429546 updates groupd: purge messages from dead nodes dlm_tool: print correct rq mode in lockdump libdlm: fix lvb copying dlm_controld: new version dlm_controld: quorum checking libdlm: max name length sanity dlm_controld: max name length sanity gfs: don't cancel glocks when writing to hidden file gfs_controld: retry recovery for withdrawn journal fenced: new version fenced: more new devel dlm_controld: build plock code fenced: new libfenced interface fence: using new libs fenced: process queries in a thread fenced: allow queries during fencing; group queries fence: fence_tool list and fenced_domain_nodes() fence_tool: fix list command libdlm: use linux/dlm.h from 2.6.26-rc libdlmcontrol: new lib interface to dlm_controld dlm_controld: fix build problems in previous commit libdlmcontrol: filling out code dlm_controld: filling out code dlm_controld: code for info/debug queries dlm_tool: add libdlmcontrol query commands daemons: mostly daemonization stuff daemons: queries dlm_controld: fix waiting for removed node dlm_controld: options to disable fencing/quorum dependency dlm_controld: dlm_tool query fixes dlm_tool: refine list output dlm_controld: remove unworking re-merge detection dlm_controld/gfs_controld: ignore write(2) return value on plock dev dlm_controld: use started_count to detect remerges gfs_controld: rename files gfs_controld: move recover.c gfs_controld: restructuring gfs_controld: new version dlm_controld/gfs_controld: minor fixes gfs_controld: basic fixes fenced: revert logsys commits fenced: use logsys fence_node: use simple logsys api fenced/fence_node: use SYSLOGLEVEL fenced: link with liblogsys gfs_controld: support queries from gfs_control gfs_controld: add query code gfs_controld: add journal for new node 
fenced/dlm_controld/gfs_controld: ccs/cman setup fenced/dlm_controld: fix quorum waiting fenced: tune logsys settings groupd: sync daemon setup/structure with others fenced: enable new logsys mode flag fenced: fix logsys define dlm_controld: set id before recovery gfs_controld: change start message from new members gfs_controld: add missing endian conversion gfs_controld: byte swap ids earlier gfs_controld: close dlm_controld connection fenced: improved start messages fenced: munge config option code fenced: debug logsys options dlm_controld: improved start messages fenced: complete messages copy start messages fenced: munge logging dlm_controld: use logsys gfs_controld: use logsys dlm_controld/gfs_controld: add logging.c file groupd: use logsys groupd: detect group_mode fenced: use group_mode detection dlm_controld: use group_mode detection gfs_controld: use group_mode detection fence_tool: add domain member checks dlm_controld: allow early fs_register gfs_controld: register with dlm_controld earlier libdlm: remove device node creation/removal dlm_tool: handle all join flags group_tool: use mode from groupd fenced: finishing off query stuff dlm_controld: queries in libgroup mode libdlm: handle truncated device names gfs_controld: queries in libgroup mode dlm_controld: fs_register and fs_result fixes dlm_controld: kill the cluster on misbehaving nodes dlm_controld: fix nodeid in fs_result gfs_controld: fix fs_notify during recovery dlm_controld: open dlm-monitor misc device gfs_controld: kill the cluster on misbehaving nodes fenced: kill the cluster on misbehaving nodes groupd: remove detection of uncontrolled kernel dlm and gfs dlm_controld: isolate cman and fence code fenced: add skip_undefined option gfs_controld: ignore dlm uevents groupd: fix daemon quit on SIGTERM fence_tool: new option to delay before join init.d/cman: use fence_tool -m for two node clusters fenced: joining daemon cpg to bypass fencing fenced: handle merge of cpg partition 
dlm_controld/gfs_controld: handle merge of cpg partition libdlm: /dev/misc/dlm-control created by udev fence_tool/dlm_tool/gfs_control: improve ls output format mount.gfs: fix mount error handling gfs_controld: ignore second leave gfs_controld: fix and implement remount dlm_controld: ignore old plock dev when using new one gfs_controld: withdraw and recovery fixes gfs_controld: ignore uevents after first_done gfs_controld: add protocol negotiation dlm_controld: add protocol negotiation fenced: add protocol negotiation fenced/fence_tool: improve list info fence_tool/dlm_tool/gfs_control: remove error message daemons/tools: misc minor cleanups and improvements dlm/fence: daemon fixes and tool improvements gfs_control: improve ls output fenced/dlm_controld/gfs_controld: modify a debug message dlm_controld: fix plock dump groupd/fenced/dlm_controld/gfs_controld: init logging after fork gfs_controld: move log_error message fenced/dlm_controld/gfs_controld: query thread mutex gfs_controld: simplify misc device handling and fix plock dump dlm_controld: enable calls into deadlock code dlm_controld: fix the recent realloc fix in deadlock code dlm_controld: join should return error without fence domain libfence: remove ccs reconnect code libfence: no logging dlm_controld: clear plock syncing flags fence_tool: refuse to leave if dlm lockspaces exist group_tool: show groupd compat info fenced/dlm_controld/gfs_controld: config update reread libccs: update ccs_read_logging liblogthread: new options liblogthread: time stamp when entry is added fenced: new logging stuff groupd: new logging stuff liblogthread: improve thread handling dlm_controld: new logging stuff gfs_controld: new logging stuff liblogthread: add LOG_MODE_OUTPUT_STDERR groupd/fenced/dlm_controld/gfs_controld: log macros fence_node: use logthread groupd/fenced/dlm_controld/gfs_controld: don't retry ccs_connect groupd/fenced/dlm_controld/gfs_controld: startup info messages groupd: libcpg mode can skip some more 
libgroup stuff fenced/dlm_controld/gfs_controld: log exiting message only once fenced/dlm_controld/gfs_controld: error handling in groupd detection fenced: log protocol message type liblogthread: do nothing without init groupd/fenced/dlm_controld/gfs_controld: log logging settings dlm_controld: recv error checking libccs: fix ccs_read_config groupd/fenced/dlm_controld/gfs_controld: get logfile from ccs fenced/dlm_controld/gfs_controld: improve groupd waiting gfs_controld: cannot connect to dlm_controld error group_tool: fix dump gfs_controld/dlm_controld: fix lock syncing in ownership mode dlm_controld/gfs_controld: plock dump display resource owner gfs_controld: use new uevent strings dlm_controld/gfs_controld: plock config paths dlm_controld/gfs_controld: fix plock rate limiting dlm_controld/gfs_controld: dump unused resources dlm_controld/gfs_controld: read lockless resources from ckpts dlm_tool: lockdebug using new debugfs file dlm_tool: change to new debugfs scan gfs_controld: remove groupd compat gfs_controld: replace cman with cfg gfs_controld: cpg_finalize gfs_controld: don't exit from query thread gfs_controld: dlm_controld failure is cluster_dead gfs_controld: wait for dlm registration gfs_controld: fix lockfile gfs_control: ls failure should exit with failure gfs_controld: new libcpg api gfs_controld: gfs_control: fix shadow warnings gfs_controld: include mg name prefix in log messages gfs_controld: copy some fenced changes Revert "gfs_controld: Remove three unused functions" gfs_controld: watch cluster membership
Fabio M. Di Nitto (559): Use resrules-noccs in dtest build target Fix build on parisc Commit new build system as proposed and discussed on cluster-devel mailing list: Remove unused vars Fix gfs2 identity exit code path allow to specify --fence_agents="list of fence agents" both libccs and daemon were building and linking common/log.c. When the project switched away from magma, we forgot to enable IPv6 for pretty self explanatory, this code is not used anywhere. Get rid of it. Readd ipv6 support to ccs_tool update and add verbose option Remove dead code Fix build system. Thanks to Alasdair for spotting the error Rrestore the make dependencies within the same subproject (same as it Remove unused files. * Fix incdir usage across the entire tree so that: Fix LDFLAGS override: Fix dlm/tool install and clean target both gnbd and gfs1 need some love for .22.. Remove old dead code from the tree. Wave goodbye to libcman bits :) Make sure to cleanup the buffer when processing each request or dirty data Fix build on ia64 by adding a temporary workaround and make sure to wrap Overload Makefile to give Lon a build target and keep the style consistent Fix build on parisc as we did for ia64 Clean up some Makefiles that did not use proper openaisincdir and dlmincdir. group/ now depends libdlm. Express this new dependency in top level Makefile add clean: target or make clean will fail. Remove redundant gfs_ondisk.h from gfs/include/ and gfs2/include/ Allow the full cluster suite to build using external kernel source. Cleanup FOO_RELEASE_NAME to RELEASE_VERSION Fix build with gcc-4.2 Remove ddraid from CVS HEAD. Remove cs-deploy-tool from CVS HEAD. Remove fence/agents/xen from CVS HEAD. Add dlm/tests/Makefile -Wall is added by default in CFLAGS via configure to make/defines.mk. change the default CFLAGS to "-Wall -O2 -g". 
Remove obsoleted Makefile Clean up cman/tests/Makefile Cleanup gfs/tests/ Makefiles Cleanup group/test/Makefile This is the first patch of a long series to collect common Makefile targets Cleanup clumon/ as agreed on cluster-dev Fix build warning Remove old code. ACK by Lon Collect all common make targets for fence/agents written in perl Collect common make targets for fence/agents written in python So in this first patch (that seems the most urgent one): Fix more warnings when building with -O2 and also fix get_rmtabd_loglevel Fix configure and Makefiles to cope with kernel built with O=/path... white space cleanup Fix uninstall target Add support to allow disable the build/install targets for each specific switch permanently to perl -w configure: Backticks don't work in strings. Use POSIX::uname(). Fix configure to handle properly 0.x or x.0 releases. Use proper vars to disable targets Fix purely cosmetic typo Clean up duplicate ccs query paths Use right vars to print debugging info Use standard path var and memset it before each query Apply, rework and cleanup second part of patch from Marco Ceci to fix 354421 If votes for quorumd is _not_ specified in cluster.conf, then Add cman_wait_init as wrapper for cman_admin_init/cman_init and cman_is_quorate Remove cman_wait_init for now. It was becoming overly complicated for such Clean up STRIP usage. It is not consistent and we shouldn't strip random The new Makefile system never invokes LD directly (and this is a good thing). Be consistent across the entire tree on AR and RANLIB invocations Hard encode paths to (u)mount.gfs* Fix bugzilla 362031 Switch configure to use perl warnings and fix them up. Do not install stripped binaries Make sure we invoke virConnectOpen with a proper URI. NULL is deprecated Fix extracflags and extraldflags to be recognized as options or configure will Cleanup leftovers from the very old build system. 
We were using a very complex * globally rename BUILDDIR to SRCDIR to reflect what it really is. Fix 2 corner cases when setting up the objdir: Minor objdir rework to extend flexibility. Fix clean target Use newly defined $(OBJDIR) to source .mk files snippet. A long time ago we did start collecting common Makefile snippets in one Collaps all man Makefile's common snippets into man/man.mk Install forgotten dlm man pages. Remove obsolete and unused Makefile apply alpha sort :) Collapse all common clean: target bits into make/clean.mk generalclean: target. Remove unrequired distclean targets Convert to use make/clean.mk Collapse all install: and uninstall: targets in make/install.mk make/uninstall.mk Fix a few regressions introduced by the big Makefile clean up: Fix fence build dependencies. For too long we did rely on fence/Makefile Fix gnbd build dependencies. For too long we did rely on gnbd/Makefile Fix all: target. Once again change ifdef to fix fail to build on hppa/parisc Fix build with -DDEBUG Fix building when -DDEBUG is defined. Fix error reporting to aisexec. aisexec config parser expects error_string to be set also when we successfully Add interpreter to ocf-shellfuncs. Cleanup manpages to work with whatis. Fix buffer align. So far this one makes the entire stack run on sparc Fix clean target. core files have pid attached to them. makes it possible to change the default configuration file by setting Fix "off the source tree" install. This was a small regression introduced Fix mkdir invokation to not fail when dir already exists Fix alignment issues in rgmanager. This makes it possible to run rgmanager Whitespace cleanup Allow the resource to run on Debian/Ubuntu systems without manual patching, Add fake support for -r option at umount so we don't fail if gfs2 is not umounted Update gnbd kernel modules to build with 2.6.24 EXPORT_SYMBOL(xtime) has been removed in 2.6.24. 
Update gfs to cope with 2.6.24 export op changes and other bits fix gfs for the removal of sendfile and helper functions Remove unused variable Fix build warning Bump kernel check to 2.6.24 Remove obsolete file Red Hat bugzilla 244343: Bugzilla 227892: Remove unrequire functions. This follow gfs2 changes Whitespace cleanup Stop linking against unrequired libraries. Cleanup man page. Man page cleanup. Clean up qdisk man page. Fix http://bugs.debian.org/465790 Sync missing commit from RHEL5 branch: [CMAN] Move ccs config ais module into ccs/ccsais [BUILD] Fix configure script to handle releases [CCS] Upload all subsystem configs into objdb Add toplevel .gitignore [BUILD] Allow release version to contain padding 0's [CCS] Fix xml -> objdb config import [CCS] Cleanup duplicate vars from previous commit [CCS] Fix possible memory corruption on double free [CMAN] Drop dependency on libdevmapper [CMAN] Fix building when -DDEBUG is not specified [BUILD] Fix handling of version and libraries soname [BUILD] Update .gitignore for .o and .d files [BUILD] Set -MMD as default CFLAGS [BUILD] Fix man page install permission [CMAN] Fix config handling [CMAN] Do not duplicate entries in the objdb [CMAN] qdisk: add credits to Joel [FENCE] Enable fence_apc_snmp [FENCE] apc_snmp: allow paths to snmp binaries to be configurable [BUILD] Remove extra debugging entry [BUILD] Fix fenceperl and fencepy make snippets to allow multiple targets [BUILD] Add fencelibdir support [BUILD] add enable_experimental_fence_agents configure option [FENCE] Move apc_snmp README where it belongs [FENCE] Move apc_snmp README where it belongs [FENCE] Remove obsoleted fence_apc perl implementation [BUILD] Royal cleanup of the fence agents build system [BUILD] Enable build and install of experimental fence agents [FENCE] Fix fencelib to pring version and copyright [FENCE] Make sure to version and copyright all built files [BUILD] Fix clean target for experimental fence/agents/lib Revert "Fix help message to refer 
to script as 'fence_scsi_test'." Revert "fix bz277781 by accepting "nodename" as a synonym for "node"" [KERNEL] Update modules to build with 2.6.25 [GFS2] Fix build warning [BUILD] Add --enable_crack_of_the_day configure option [BUILD] Fix typo [BUILD] Set automatically cflags when building experimental bits [GROUP] Fix building with standard kernels Revert "gfs2_tool: Fix build warnings in misc.c bz 441636" [BUILD] Fix clean target [RGMANAGER] Fix build with gcc4.3 [GFS2] Fix build warning [rgmanager] Remove obsolete clushutdown utility [CCS] libraries should never log [CCS] Convert to logsys [MISC] Update copyright headers [MISC] Update Red Hat main copyright file [BUILD] Deal with the new libfence properly [BUILD] Fix install/uninstall targets for fence/agents/lib [BUILD] Allow users to set default log dir and syslog facility [CCS] Switch to use user selected logdir and syslogfacility [CMAN] Convert qdiskd to use logsys [CMAN] Use build/user defined default logging facility [CCS] Document -d (debugging) switch [CCS] Allow ccsd logging level and facility to be set by cluster.conf [BUILD] Deal with new libfenced [BUILD] Fix building with separate object dir [BUILD] Fix kernel check for good [BUILD] Fix fence lib install target [GROUP] Apply patch to make gfs_controld work with 2.6.26 [FENCE] Enable new fence agents by default [RGMANAGER] Fix uninstall target [BUILD] Fix build order. Gotta love circular build depends... 
[CMAN] Setup logging file [BUILD] Change build system to cope with new libdlmcontrol libdlm: fix libdlmcontrol in Makefile [CMAN] Do not query ccs as it might not be the right config plugin [CCS] Detach dependency on ccsd to run the cluster [CCS] Fix build with gcc-4.3 [CMAN] Set default syslog facility at build time [BUILD] Allow users to set path to init.d [MISC] Fix build errors with Fedora default build options [MISC] Fix more build errors with Fedora default build options [MISC] Fix even more build errors with Fedora default build options [BUILD] Fix install when building from a separate tree [MISC] Fix some gfs2 build warnings [BUILD] Require 2.6.26 kernel to build [GNBD] Update gnbd to work with 2.6.26 [GFS] Make gfs build with 2.6.26 (DO NOT USE!) [RGMANAGER] ^M's are good for DOS, bad for UNIX [BUILD] Move fencelib in /usr/share [MISC] Cast some love to init scripts [CMAN] Fix path to cman_tool [FENCE] Rename bladecenter as it should be .pl -> .py [DLM] Remove unused header file [BUILD] Add --without_kernel_modules configure option [BUILD] Free toplevel config/ dir [CONFIG] Create config/ subsystem [CONFIG] Add missing Makefiles [CCS] Make a bunch of functions static [BUILD] Stop using DEVEL.DATE library soname [GFS] Fix comment [INIT] Do not start services automatically [GFS] Sync with gfs2 init script [BUILD] Fix sparc #ifdef according to the new gcc tables [MISC] Update copyright [BUILD] Fix build order Merge branch 'master' of ssh://sources.redhat.com/git/cluster [BUILD] Fix dlm_controld linking [BUILD] Fix rg_test linking [BUILD] Fix install permissions [GFS2] Use proper include dir for libvolume_id [FENCE] Fix copyright header for fence_ifmib manpage [FENCE] Fix ifmib README to report the right fence agent [BUILD] Plugin the new shiny fence_ifmib agent [CCS] Use absolute path for queries [CONFIG] Fix lots of bugs in libccsconfdb [BUILD] Add fence_lpar fencing agent to the build system [GFS] remove symlink to umount.gfs2 [GROUP] libgfscontrol: fix 
build with gcc-4.3 [BUILD] Change build system to cope with new libgfscontrol [BUILD] gfs2 requires group to build [BUILD] Fix mount.gfs2 build [MISC] Make several API's private again [CONFIG] Add full xpath support to libccs [CMAN] Bump library version [BUILD] Switch libdlmcontrol back to shared library [BUILD] Collapse common library makefile bits in libs.mk [MISC] Remove obsolete and empty files [MISC] Add top level licence files [MISC] Cleanup licence, copyright and header duplication [MISC] Tree cleanup [BUILD] Prepare infrastructure for perl/python bindings [GNBD/FENCE] Move fence_gnbd agent where it belongs [MISC] Update top level copyright file [BUILD] Fix file permissions all around [MISC] Whitespace cleanup [MISC] Relicence rgmanager/src/resources/oracledb.sh under GPLv2+ [GFS] Remove obsoleted gfs_edit in favour of gfs2_edit [MISC] Remove osl-2.1 exception from README.licence [MISC] Add original author for cman/qdisk/disk.c [MISC] Remove old copyright [MISC] Add another exception to COPYRIGHT [GFS2] Add missing include and fix build warning [QDISK] Add better support for Xen virtual block devices [CCS] Fix build warnings on sparc [CCS] Add missing CCSEXIT call [CCS] Fix priority setting [CCS] Fix a few logsys configuration bits [CCS] Remove duplicate code and make it common [CCS] Remove LOG_MODE_DISPLAY_DEBUG from logsys settings [CCS] Init logsys as early as possible [CCS] Shrink more common code for internal xml queries [CCS] Add cosmetic CCSENTER/EXIT for simple xml queries [CCS] Improve logsys init order [CCS] Fix improper log level on debugging information [CCS] Convert ccs logsys config to the ais format [QDISK] Fix build with new openais logsys [QDISK] Fix debug type [QDISK] Make get_config_data static [QDISK] get_config_data cleanup [QDISK] Remove duplicate debugging configuration [QDISK] Clean handling of debug envvar [QDISK] Init logsys later in the process [QDISK] Major clean up [BUILD] Fix new gfs_controld Makefile [CCS] Always check for 
debug setting as first thing [CCS] Fix debug override from command line vs config [QDISK] Port qdisk to the new logsys config interface [MISC] Logging: optimizing query sequence [FENCE] Start porting fenced to logsys [FENCE] Make fenced ready to load logsys config [FENCE] Move logsys configuration calls where they belong [CCS] Set debug from syslog_level only when requested [QDISK] Set debug from syslog_level only when requested [FENCE] Allow fenced to configure logsys [FENCE] fenced: separate concept of fork and debugging [CCS] Use common syslog facility [FENCE] fence_node: use logsys for logging to syslog [CMAN] Remove unrequired includes [FENCE] fenced: update man page [GFS2] hexedit does not need syslog [FENCE] fence_tool: document "ls" [CCS] Remove duplicate header [CONFIG] Make sure to reset xml index in not in list mode [CONFIG] Add cluster.conf direct loader [CONFIG] Fix several bugs in XML parsing implementations [BUILD] Add configure options for libldap [BUILD] Allow configuration of docdir [BUILD] Fix docdir default path [BUILD] Add install/uninstall snippets for documents [BUILD] Install ldap schemas and example in document directory [MISC] Documentation cleanup [BUILD] Fix install of telnet_ssl [BUILD] Fix telnet_ssl build [BUILD] Allow users to configure default built-in syslog level [MISC] Use default configured SYSLOGLEVEL across the tree [BUILD] Add make oldconfig target [MISC] Update .gitignore [MISC] Fix logging file query [CONFIG] Fix loadldap include [BUILD] Plug confdb to ldap tool [MISC] Create and install logrotate file [BUILD] Clean extra kernel modules files [MISC] Fix build with newer toolchain [CCS] Fix LEGACY_CODE ifdef [BUILD] Implement --enable_legacy_code in the build system [BUILD] Add ccs_test replacement when building legacy_code [BUILD] Fix ccs.h include path [BUILD] Fix doc install target when building objects outside source tree [CCS] Kill obsolted ccs_test [RGMANAGER] Port all resource agents to new ccs interface [RGMANAGER] 
Port smb resource agent to ccs_tool [BUILD] Fix race condition in oldconfig update/execution [RGMANAGER] Use proper ccs_tool query output [BUILD] Fix ccs_tool/ccs_test build with new compat code [CCS] Inflict hopefully last compat issues love to ccs_t* Revert "[RGMANAGER] Use proper ccs_tool query output" [RGMANAGER] Port ccs_get to proper ccs_tool output [RGMANGER] Fix call to ccs_tool [BUILD] Fix ccs_tool linking dir order [BUILD] Fix logrotate snippet filename [FENCE] Sync fence_apc_snmp from RHEL47 branch [BUILD] Fix LOGDIR usage [FENCE] Fix fence_apc_snmp logging [BUILD] Cleanup linking order for logsys [BUILD] Cleanup groupd makefile build: update .gitignore Revert "fence: port scsi agent to use ccs_tool query and drop XML::LibXML requirement" Revert "fence: simplify init script" Revert "rgmanger: remove check on cluster.conf from rgmanager init script" rgmanger: remove check on cluster.conf from rgmanager init script fence: simplify init script fence: port scsi agent to use ccs_tool query and drop XML::LibXML requirement rgmanager: fix clean target cman: init script should not user cluster.conf directly rgmanager: init script does not need network config config: allow users to override default config file in xmlconfig test commit Revert "test commit" bindings: add first cut of perl Cluster:CCS bindings: improve Cluster::CCS description build: clean up perl bindings build system misc: clean up "char const *" vs "const char *" init: standardize init scripts to /etc/sysconfig/cluster build: fix bindings build when using external object tree bindings: fix CCS.pm doc fence: remove unrequired headers from rackswitch build: fix several issues related to install and build targets build: drop "all" dependency from install: targets ccs: move to the new logsys init API qdisk: port to new logsys api build: properly respect non standard libdir and incdir ccs: turn more ccs_tool code into legacy code build: fix ccs_test symlink install target cman: make sure not to umount 
configfs when there are other users config: fix objdb2xml filtering rgmanager: unbreak locking in clulib libccs: add support for /child::*[%d]/ for xpathlite build: add support for corosync build: bump kernel requirement to 2.6.27 cman: make ccsd startup optional and allow override of config loader config: move ccs/ccs_tool to config/tools/ccs_tool cman: switch default config parser to xmlconfig ccs: libccscompat don't include unrequired header ccs: move debug.h to ccs/daemon ccs: move comm_headers.h to ccs/daemon config: move generic documenation and man pages to config/man ccs: move libccscompat into config/libs and mark it legacy code ccs: move ccsais plugin to config/plugins/ccsais and mark it legacy code ccs: move ccs/daemon to config/daemons/ccds and mark it legacy code build: define legacy_code=1 on clean target libccs: add support for /child::*[%d]/ for xpathlite qdisk: allow scan of sysfs to dive into first level symlinks qdisk: fix sysfs path diving build: create contrib/ top level section build: add contrib/Makefile build: plugin askant in our build system misc: remove exec bits from different files build: rename --enable_xen to --enable_virt build: add --without_config build option build: bump library soname to 3.0 cman: init script best to require $time build: fix clean target of contrib section config: make more functions static ccs: libccsconfdb header cleanup misc: init scripts clean up libdlm: major cleanup rgmanger: fix handling of VIP v6 ccs: deal with xml file format special case fence: update alom description fence: install fence_alom man page misc: cleanup ifdefs around RELEASE_VERSION dlm/fence/gfs: fix daemon spinning 100% due to memory corruption fence egenera: fix logging file rgmanager: fix build after port to logsys misc: cleanup copyright.... 
again misc: fix gfs2_edit build fence: update man page for fence_apc gfs2: randomize debugfs mount point gfs2: randomize file for savemeta operations gfs2: remove unused define rgmanager: randomize file for automatic data dump rgmanager: randomize ASEHAagent temp files rgmanager: move fs.sh log file where they belong rgmanager: move nfsclient.sh cache files where they belong rgmanager: move oracledb.sh log files where they belong build: reinstate targets in rgmanager metadata check rgmanager: randomize SAPDatabase temp file libgfs2: randomize creation of temporary directories for metafs mount xmlconfig: remove debugging fprintf ccs: implement config reload in legacy ccs cman: add /libccs/@next_handle support ccs: libccs major rework pass 1 ccs: libccs split ccs_lookup_nodename into extras.c ccs: libccs major rework pass 2 ccs: libccs major rework pass 3 ccs: libccs major rework pass 4 ccs: remove duplicate entry in internal header file ccs: libccs major rework pass 5 common: plug liblogthread in the system build: use standard syslog priority name rather than corosync ccs: add ccs_read_logging ccsais: fix buffer overflow when reading huge config files xmlconfig: fix buffer overflow when reading huge config files ccs: cleanup ccs_read_logging gfs2: randomize debugfs mount point even more gfs2: randomize file for savemeta operations even more rgmanager: move state dump file where it belongs rgmanager: randomize ASEHAagent temp files even more rgmanager: randomize SAPDatabase temp file even more rgmanager: randomize oracledb.sh temp file misc: fix mktemp usage rgmanager: randomize smb.sh temp file rgmanager: randomize svclib_nfslock temp dir gfs2: randomize creation of temporary directories for metafs mount more cman: make cluster_parent_handle function specific cman: implement reload operations for cman-preconfig cman: implement and simplify configuration reload operations cman: update man page for reload operations dlm_controld: handle cman config update 
notifications gfs_controld: handle cman config update notifications groupd: handle cman config update notifications rgmanager: Fix smb.sh shell scripting logthread: fix usage of syslog(3) cman: add new daemon for notification to custom bits cman notify: add call back to external script cman notification: add shell and build infrastructure cman notify: update init script fence scsi: plugin reload notification script into cmannotifyd cman notifyd: add man page cman notify: add note to man page cman notify: add script template to doc/ cman notify: add logging to cman_notify cman notify: fix a few bits in the shell area cman notify: wait for forked process to terminate.. cman notifyd: export quorum information on statechange ccs: libccs implement reload operations ccs: simplify libccs reload code build: prefer init scripts generated in the objdir rather than source init scripts: major rework to make them distro agnostic build: fix kernel module install dir to respect DESTDIR rgmanger: fix build system libccs: cleanup build: respect build: respect EXTRA_CFLAGS in cobj.mk libfence: use ccs_connect instead of force_connect. config: fix loading of multiple objects with no subojects build: allow libs to have indipendent sonames build: change error string gnbd: remove from cluster project xmlconfig: major rework cman: port notifyd to new logthread api build: fix missing ${logtlibdir} for linking build: allow system to use zlib in non standard paths cman: make init script stop cmannotifyd build: fix fence_node Makefile cman: reenable stderr output in notifyd fence: install virsh fence agent man page libccs: build with latest corosync dlm_controld: include saAis from openais build: fix dlm_controld makefile build: fix fence agents man page Makefile build: install fence_vmware_vi bits in the appropriate locations build: don't set exec bit on built files. 
build: install fence_vmware_vi_helper in sbindir build: fix typo and get rgmanager to build again build: fix fence_scsi installation bits dlm_controld: stop linking against logsys logthread: filter messages to stderr to respect logging priorities libccsconfdb: do not allow logfile_priority to override debug. ccsd: port to logthread infrastructure build: restore original behaviour when building groupd qdisk: fix mkqdisk output config: time to say goodbye to ccsd new gfs2-utils master test commit split tree into separate projects build: install libgfscontrol static lib and header file gfs2: improve init script build: fix gfs2 init script Makefile misc: Update copyright for 2009 build: remove reference to libgroup build: bump kernel requirements to 2.6.28 gfs2: fix binary and manpage names build: stricter install invokation build/init: install/create common dirs build: add FORCESBINT install/uninstall target gfs2: fsck and mkfs binaries should be in /sbin build: fix install target for SBINSYMT. build: propagate relative info about /sbin vs sbindir gfs2: use relative links misc: drop obsoleted bits build: fix doc Makefile to stub targets gfs2: sync gfs_ondisk.h from gfs1-utils gfs2: fix build warnings spotted by paranoia cflags gfs2: restore libgfs2.h vfprintf call gfs2: fix endian conversion gfs2: don't swab in place build: drop dependency on libvolume_id build: convert to autoconf/automake/libtool build: drop volume_id from autoconf build: cleanup autogen and stop warnings on configure build: create m4 dir upfront I tried to build using the latest autogen.sh and got errors. 
gfs2: drop dead test code gfs2: handle output conversion properly gfs2: add missing casts libgfscontrol: fix const warnings libgfscontrol: make functions static gfs_control: fix const warnings gfs_control: make functions static gfs_controld: fix function declaration gfs_controld: fix const warnings group: fix void arithmetic group: fix print formats gfs_controld: fix declaration warning build: drop build dependency on python build: check for libquorum build: relax autotools requirement gfs2: fix build failure misc: update copyright year across the board gfs2: make init script LSB compliant
Goldwyn Rodrigues (4):
      tunegfs2: Initial framework
      tunegfs2: Add operations to modify the superblock
      tunegfs2: Modify Makefiles/configure scripts
      tunegfs2: Change default install path /usr/sbin
James Parsons (34): Fixes bz172464; adds WTI RPS10 agent to build Fix for bz168698 Fixed typing designation for var Fix for bz176375 regression Added fence_bladecenter back in to install list added explanation of new auth type switch rsa support fixed typos man page for fence_rsa Fix for bz168698 Support for drac 4/I Addresses interface change in drac_mc firmware version 1.2 Fixed copy - paste error in usage file Makefile was initially added on branch RHEL4. file README_SNMP was initially added on branch RHEL4. file fence_apc_snmp was initially added on branch RHEL4. file fence_apc_snmp.py was initially added on branch RHEL4. file powernet369.mib was initially added on branch RHEL4. Makefile and cool new scsi agent renamed to match convention. Added support for new fence_scsi agent Added new fence_scsi agent support remove this file in preference for the version with filetype extension, like other agents. The Makefile generates the version for the sbin dir without extension. Removed BULL refs from man page addresses bz193065 ignored unused args from stdin Ignore unused args to stdin Support for DRAC ERA file fence_baytech.py was initially added on branch RHEL4. bz222234 fix for bz205457 fix for bz220946 New apc agent written in python that supports named outlets and outlet groups, minus perl pain. Addresses bz172179 and bz134489. yee haw Fix for bz238106, new firmware version issues Fix for 251358
Jan Friesse (25):
      fence: Fence agent for VMware ESX
      cman: Removed old Perl version of VMware fence agent, so new version is built.
      fence: Fix fence agent for VMware ESX.
      fence: Fix fence agent for VMware ESX.
      Fence: Added fence agent for Sun Advanced Lights Out Manager (ALOM)
      fence: New fence agent for Logical Domains (LDOMs)
      [fence] Fence agent for ePowerSwitch 8M+ (fence_eps)
      [fence] Fixed man pages makefile, so fence_eps.8 is now installed.
      fence: Added support for no_password in fence agents library and fence_eps.
      fence: Fixed case sensitives in action parameter.
      fence: Fix -C switch description in Python library
      fence: Operation 'list' and 'monitor' for Alom, LDOM, VMware and ePowerSwitch
      fence: Fix operation 'list' and 'monitor' for LDOM and ePowerSwitch
      fence: New fence agent for VMware using vmrun command.
      fence: IPMI over lan timeout adjusted and configurable
      fence: fix IPMI parameters containing special characters
      fence: fix IPMI typo in help
      fence: fix IPMI spawn /bin/bash rather than /bin/sh
      fence: fix IPMI man page
      fence: fix IPMI over lan to support ciphersuite select
      fence: Add libvirt (virsh) based agent
      fence: Set binary on telnet connections
      fence: Added fence agent based on VMware VI API
      fence: VMware VI helper path fix
      fence: VMware VI better handling of "strange" names
Jim Meyering (13):
      don't dereference NULL upon failed realloc
      * fence/agents/xvm/ip_lookup.c (add_ip): Handle malloc failure.
      * gfs/gfs_fsck/inode.c (check_inode): handle failed malloc
      remove dead code (useless test of memset return value)
      add comments marking unchecked malloc calls
      Remove unused local variable, buf,
      add comments marking unchecked strdup calls
      handle some malloc failures
      Use automake's sbin_PROGRAMS, rather than writing our own rules.
      don't dereference NULL for a hostdata string with no "="
      hexedit: avoid NULL dereference upon failed malloc
      Revert "hexedit: avoid NULL dereference upon failed malloc"
      install mkfs.gfs2 into /sbin, not /usr/sbin
Joel Becker (1): libdlm: Don't pass LKF_WAIT to the kernel
Jonathan Brassow (153): - targets to build outside the kernel. - add ability to compile outside kernel - make gnbd buildable outside kernel - update make files - you meant clu_fence not cp_fence, I assume. - check in patch for making links correctly in make file - change log code to not print log_msg if verbose is not set. - work around broken exit - add targets to the make file for updating subtrees to latest tags. - couple changes to make files - remember to closedir() - update configure scripts to set %{libdir} to /usr/lib - sometimes getsockopt would report that bcast was enabled when it wasn't. - add some man pages - multicast option is not ready yet. Fail if it is specified. - when cman shuts down (as when cman_tool leave happens) close the connection - must zero out rset variable before populating. - specifying a cluster name is only valid when the force command is used. - add cmd to remove old libs that were located in /lib (as part of the - allow ccs to reload config file if the node is quorate (covers the - the sm plugin for magma seems to emit CE_NULL instead of CE_MEMB_CHANGE - misc prog for magma. ccs is located in 7, not 8 - ccs is located in 7 not 8 - incorrect synopsis - take a stab at the cluster.conf.5 man page - move some stuff around. - cman_tool.8 was still referenced in the cman_tool directory - it wasn't a good idea to add rm -f /lib/libmagma* to the make file - don't need the gfs_eattr man page, as the tool no longer exists - forgot to remove eattr from the Makefile - use indexing rather than ccs list handling to retrieve node names - correctly specify the libdir in configure as /usr/lib - a couple minor updates to multicast code - this should get rid of the ENETUNREACH problems... 
as well as the - fix memory leak + other minor anoyances - fix make clean so it cleans up binaries in bin dirs - remove unneeded print - fix for bug rbz 132680 - fix for bug rbz134282 (bad handling of port number specification) - fix problem that generated a circular dependency - change a log_msg to log_err, so that the error is printed even if the - ensure that the config_version is an integer - fix seg fault that occurs when querying ccs with desc that is too large - cluster mirror - forgot the uninstall script - add cmirror target - a little clean-up - make the "in_sync" flag log based (rather than module-based) this - remove some print statements and refine todo list - update example cluster.conf to reflect changes in tag names - updates for new cluster.conf tags - update docs to reflect tag name changes in cluster.conf - updates to reflect tag name changes in cluster.conf - updates to reflect change in cluster.conf tags - updated fence_devices/device, but forgot nodes/node - Add message (that prints when -v flag is used with ccsd) that - add log_msg_always macro, which prints a msg of priority NOTICE - fix bug 143165, 134604, and 133254 - update related issues - fix a problem where if the working path was set (via ccs_set_state()), - minor updates to ccs_tool - fix bug: 128422 (ccsd grabs network config silently if local is bad) - fix bug 128662, ccs_test connect <clustername>: connects to local - remove log_msg_always - add ccs_tool to Makefile - report better errors (especially in the event that magma plugins aren't - some cmirror changes that have been sitting around. - if a lockfile can not be created, print that ccsd is already running. - forward port bug fix 133254 - fix for bug #137021 ccs doesn't find most recent cluster.conf IF YOU USE CCS, YOU WILL WANT TO READ THIS: - sync with changes made to RHEL4 branch - ccs_tool now handles the update process. 
- warn every 30 sec (vs 10) if ccsd can not connect to cluster - add ccs_tool man page - fix bug with update when absolute path was specified. - better error reporting when there are multiple concurrent updates - fix the CCS archive -> xml conversion for the gulm section - ccs(7) not ccs(8) - fence_node has been change to work better with gulm (143487) - add -V and -h to ccs_test - missing break statement, thanks to Bastian Blank - replace an instance of ccs_get with ccs_get_list - commit man page changes from Bastian Blank - commit changes to dm-log_cluster.c before its break-up - break the cluster mirror file into separate files - fix senario where a server dying could leave clients with regions marked - commit changes so I don't loose them. - no cman on upgrade. - typos - remove cmirror target - fix SEG FAULT - Teigland's patch to make CCS skip clustering and just read the local - fix for bug 157094 - use umask so that permissions on /etc/cluster/cluster.conf are -rw-r----- - last commit before major changes... don't want to loose this. - Bring back up to sync with latest mirror changes - cluster mirror is working again, but requires patches to kernel - add cluster "core" log support - don't call complete() on failure_completion if we are in "core" mode. - do not call completion on every suspend, it increments a counter - Fix cmirror bugs - s/uint32_t/sector_t for get_region_size return - millennium latency fix for 2.6 - commit Lon's patch - endian fixes so heterogenious clusters can work WTR CCS - Make the init script do a cman_tool leave remove on stop. Restarts - ccs_tool can seg fault on upgrade bug 186121 - This is the beginning of the cluster mirror log rewrite. The - filling out client side logging implementation (patches sent previously) - cmirror is not ready to compile in HEAD lsnodes -> lsnode typo. Bug 236580: [HA LVM]: Bringing site back on-line after failure causes pr... 
People seem to think that they have to setup lvm in rgmanager even though they If misconfigured, HA LVM + mirroring can cause data corruption. We should Require vg_name to be unique. Allowing multiple LVs from the same VG <Previous check-in> file Makefile was initially added on branch RHEL5. file cmirror was initially added on branch RHEL5. file clogd.c was initially added on branch RHEL5. file cluster.c was initially added on branch RHEL5. file cluster.h was initially added on branch RHEL5. file common.h was initially added on branch RHEL5. file functions.c was initially added on branch RHEL5. file functions.h was initially added on branch RHEL5. file link_mon.c was initially added on branch RHEL5. file link_mon.h was initially added on branch RHEL5. file list.h was initially added on branch RHEL5. file local.c was initially added on branch RHEL5. file local.h was initially added on branch RHEL5. file logging.c was initially added on branch RHEL5. file logging.h was initially added on branch RHEL5. file queues.c was initially added on branch RHEL5. file queues.h was initially added on branch RHEL5. file dm-log.h-copy was initially added on branch RHEL5. file dm.h-copy was initially added on branch RHEL5. file rbtree.c was initially added on branch RHEL5. file rbtree.h was initially added on branch RHEL5. BUG 427377 s/validate/verify/ lvm resource script now allows multiple LVs per VG as long as they move - a regression... 
When tagging at the LV-level, the script should complain - Bug #428448 - ccs library now checks for bad file descriptors as input - better checking for improper setup - Bug 431705: HA LVM should prevent users from running an invalid setup (2) rgmanager/lvm.sh: Fix bug 438816 rgmanager/lvm.sh: Fix bug bz242798 rgmanager/lvm.sh: change argument order of shell command rgmanager/lvm.sh: Minor comment updates rgmanager/lvm.metadata: Fix parameter description fields rgmanager/lvm.sh: HA LVM wasn't working on IA64 rgmanager (HALVM): Stop dumping debug output to /tmp
Ken Preslan (151): HCH's suggested cleanups to the GFS mount/unmount code. Make sure the VFS ACL code is compiled in if GFS in enabled. Switch to using a fixed version of OGFS' shell sort. Bring patch uptodate with the sources. Change the way a switch statement works to prevent GCC from doing stupid Reordered munging of modes on inode create. Suiddir support suggested by anton@hq.310.ru. Update to 2.6.8.1. Helper scripts for "gfs_tool lockdump". Reimplemented the flock patch to work similarly to the new way plocks Munge. Rearrange some asserts. Patches for a kernel that supports the new plock interface. Fix s_maxbytes. Print out PIDs. Rework patch to fit with Trond's comments. Update patches. Ben Cahill's new man page. Get rid of some pool crap. Get rid of a couple more 2TB checks. o Fix the BLKGETSIZE64 ioctl for Linux 2.6. The -mm kernel that includes our flock patch. Dave pointed out that I screwed up plocks. Added "cmd" argument to lm_plock(). Great comments from ben.m.cahill@intel.com. Add a missing -Wall. Whitespace munging. This should be a fix for bug #126531. Do better asserting in glock_hold(). Make an assert be more verbose. A command line tool that's a bit smarter about reading a GFS filesystem blerg. Munge. Break compound assert into two. Fix a few bugs in the eattr/acl code. A patch from ben.m.cahill@intel.com to install the gfs_mount man page. More comments from ben.m.cahill@intel.com. Munging capitalization and other stuff. Print a message before, as well as after, trying to mount the lock o Support for immutable and append-only files with a lot of help from I killed a bunch of people once. o Fix bug #135249 with a lot of help from Dave Teigland. Ever since the Formatting. Don't align the glock hash buckets. The gfs_sbd structure is now about Stuff all of the assert message into the panic call. Update patch. Update to 2.6.9. (Lock_dlm and lock_gulm are broken until updated by Stop compiler warnings. Whitespace munging. 
Fix bug #126952 Fix minor permissions issue. More comments from ben.m.cahill@intel.com. o Fix bug #135684. Change code to make sure a RG is part of the "recent" Fix my stupidity. Add some locking to the recent list and forward pointer when tearing Formatting munge. o Fix bug #133368. Change the code to trigger calls to lm_cancel() on Get rid of unneeded file. o Fix things so a freeze call requesting the transaction lock can't Missed a use of the async flag that was just removed. Many more good comments from ben.m.cahill@intel.com. Clean up the statfs code some. o Clean up some memory allocation code Get rid of unneeded return value. When updating the atime, don't demote the glock back to shared unless More comments from ben.m.cahill@intel.com. Update. Fix a spot where we weren't propagating away errors. Declare the RO array in gfs_sort() staticly instead of dynamicly on the o Add "oopses_ok" mount option. This allows GFS to oops instead of D'oh. Missed some files. More excellent comments from ben.m.cahill@intel.com. Fix bug #141821. KNFSD could call into GFS to do a lookup on ".." of Got rid of some unused LM flags. Get rid of the kernel patch. Not a battle worth fighting. Sigh. Munge. TASK_INTERRUPTIBLE -> TASK_UNINTERRUPTIBLE. More comments from ben.m.cahill@intel.com. o Add facility to stop all block I/O from a filesystem by setting a Add code to implement "gfs_tool withdraw /mountpoint". Fix a case where the resource index isn't propery released on error. Fixed a reference leak of lock module in-use counts that was caused by Dave's changes to allow "gfs_tool [freeze|unfreeze|withdraw]" to work o Changed GFS ioctls so purely incore binary structures aren't used. Get rid of -P option. It should be replace with CLVM labeling later. Update patch. Add new file. Reworked gfs_printf(). Fix statfs bug. Profiling/tracing stuff. Incore printing / profiling / tracing library. Some 2.6.10 stuff. Update patches. 
Start using the new generic permission checking code in 2.6.10. Print a better message when versions mismatch. Copyright/GPL. Unbreak the user tools. Cleanup. Don't complain if we stall trying to make a log reservation. Don't complain about uninit()ing busy log buffers if we're shutdown. Change diaper so it doesn't return errors if the filesystem is already Don't let stale/invalid data leak up to userspace on a read() from a Fix a bug in a memset(). Probable fix for BZ#146711. When running the "gfs_tool <something> <mountpoint>" commands (e.g. Add some more fields to the lockdump print out. Forward-port Tadpol's 146711 fix. Bastian Blank bastian@waldi.eu.org's manpage munging. Linux 2.6.11. Compiling is always nice. Add "debug" mount option. Causes gfs_assert_warn() and gfs_lm_withdraw() Munged printk() ordering. Add a new "flags" parameter to lm_mount() of the lock module interface. Change it so a spectator has a 's' instead of the jid in its fsid. Fix precedence error. Start including gfs_debug in the build. Fix bug #154902: Rearrange order so it actually compiles. Let there be compilation! Quit yer whining. GFS2. Still a work in progress. But it should be faster. Remove some debug code. Rearrange some assignments to work with gcc4. Add some new block types. Fuzzy statfs(): Get rid of an osi_. Add back NFS support. Fix ACL leak. munge. Don't do write_inode if we're PF_MEMALLOC. o Honor "data=ordered" for truncates Fix broken makefile. Munge. Fix a misinitialization in the diaper device. Unlock the page when erroring out of writepage(). Clean up journaled data code and metadata I/O code. Add "gfs2_tool getargs" to get the gfs-specific mount arguments used to Fix problem of writepage() needing to map blocks at weird times when Update. o Add back code to support gfs2_grow and gfs2_jadd. gfs2_jadd and gfs2_grow. Fix a couple of rename() bugs. Refix a problem with the rename lock. Don't associate it with rename Fix bug #129468 Fix BZ158133. munge Fix BZ158133. 
o Changed the lookup so it doesn't need to take the new inode's glock. o Allow the appropriate FS-specific mount options to be changed on remount Fix errors on rebuilds. o Fix a bug I introduced that would keep GFS from replaying a journal on
Lon Hohberger (498): Replace missing clu_fence() API call. Fixes from Ben Marzinski Magma plugins API fixes for GuLM, CMAN, CMAN/SM Fix build problems; fix plugins to not use cml_free Fix build problems breaking open(/usr/lib/magma/plugins, O_DIRECTORY...) Fix wrong-side of ipv6 addr->node ID assignment Display node IDs as 64-bit hex Make dumb driver act like there's a real, one-node cluster. Rename variables/macros/structures to not use leading __ Fix breakage in SM plugin when querying group membership w/o logging in; install in /usr Cleanup of useless variable Remove old message encapsulation pieces and dependency on crc32.c Make ccsd give element name back with relevant cdata Fix mutex deadlocks around cluster lock/unlock calls Remove crc32.c & build req. on it. Initial checkin of rgmanager Change cflags to make it build on newer compilers for now Clean up build Temporary sledgehammer lock fix Unbreak gulm. * src/resources/*: Add status/monitor actions to metadata Fix build OCF Actions + Autostart param addition Periodic status checks, updated resource agents, misc. bugfixes dumb was requiring libxml2 to build for no reason Remove MAX_MSG_SIZE limitation Zero out struct sockaddr_in6 structures before using... Update tasks First pass at resource tree deltas (tested). Port from clumanager 1.2.x: potential fix for #114388 Remove MAX_MSG_SIZE limitation Add Doxyfile. Add lots of Doxygen-readable comments to code. More comments More doxygen stuff Remove references to handlers for RAs. They're not needed. Handle default value in RA parameters Use libdlm_lt with cman and sm plugins (removes dependency on dlm_pthread_*). Also make all the return value semantics match. Add cp_connect() function to library. Make cptester only use cp_* functions. Per Patrick's comments: open/create lockspace when first lock is taken instead of at init time. 
Make all the different libdlm targets use -fPIC during builds for proper symbol relocation on x86_64 Ensure we wait for the AST during the unlock as well as the lock. Simplify (1) Split up thread and non-thread libraries so applications which do not Take out unnecessary printfs. Change to match with new memb_lost/gained Change to match magma API changes (clu_members_lost->memb_lost) Fix bug feist was seeing during builds on x86_64 Copy changes to old cman plugin. Fix warning on x86_64 Fix x86_64 build warnings Add preliminary netfs (NFS) and clusterfs (e.g. GFS) resource agents. Removed. Replaced by clusterfs.sh/netfs.sh * Pass SHAREDIR to build. Prune tree when a child type is not allowed Remove mallocdbg files + refs. Give ourselves ways to free/clear out Remove mallocdbg references. Clear out references to mallocdbg. Remove mallocdbg files. Don't free old block if realloc fails No longer needed. Fix changes WRT bash 3. Use the 'ip' command instead of ifcfg for now. Clean up bad mallocdbg stuff Add support for the widely-used RPS-10M-HD modules. (2 node clusters ONLY) Use -n instead of -p for port number. Ignore case for options read from stdin. Actually handle -v argument. - Add support for "-o" option to control on/off/reboot. Add checks to make sure the nfs daemons are running. - Read configuration data before joining the service group Re-enable support for specifying target nodes for enable/disable/etc. Reflect jparsons's changes to cluster.conf structure Fix relocate on more-preferred-member-boot; fix minor bugs in building alloc.c Fix up circleping to work properly. - include/resgroup.h: Remove unnecessary states/requests. Add src/daemons/Makefile: Remove debugging compiler flag Display all cluster members if we're not part of the RG manager group. Unbreak init script Make script actually work. 
Fire wall ports are corrected, but the Fix segfault if rgmanager wasn't running Put in decent errno values for cp_connect so callers can find out what happened when it failed. Kill a bunch of assertions. Return EINVAL when we try to access functions from a nonexistent plugin Report a usable reason when clu_connect fails. Fix bug that mantis saw; we shouldn't log in with newlines Fix for 44945 Use libdir/magma for plugins Fix magma plugin names + installation paths to match LIBDIR/magma amd64 fix Fix build bug on slackware Include bonded-ethernet link detection file nfs-tests was initially added on branch RHEL4. Add make check, ra-api-1-modified.dtd for validating ra metadata Internal consistency check tests for rgmanager's tree/list. Fix memory leaks as a result of not cleaning up libxml2; move test functions into test.c Merge from RHEL-4 branch: internal test cases, make-check target, actually install man pages Clean up negative memory-leak testing Merge from RHEL4 branch: man pages, error documentation + indexing Index of error messages Merge from RHEL4 branch Use {libdir} for {slibdir} if no slibdir specifed on configure's command Use {libdir} for {slibdir} if no slibdir specified Clean up lock spaces so we can unload dlm module Kill clufindhostname; its functionality isn't used, and can be provided by things like host(1) Include rg name in clustat -x output fix #146924 Misc. fixes; see changelog. Add support for Bull NovaScale machines via ipmi-over-lan and PAP management console #149522 #150067. Part of #149735. See ChangeLog. Implement basic recovery policy handling Zero out cm_addrs when reading /proc/cluster/services for a member list Fix part 2 of #150079 Fix build problem. Add -h for clustat / clusvcadm Fix for multiple simultaneous leaves not being handled properly Misc. 
bugfixes Fix 150481, part 2 Change to arch-dependent char instead of uint8_t Properly cast Show stdin options with -h Clusterfs.sh / fs.sh fixups add resource rule printout to rg_test Fix 151095 Fix various bugzillas (see ChangeLog) Use service instead of resourcegroup as root resource to match the UI and user expected behavior / terminology fix warning add init.d to install set Fix timeout bugs Fix bonding link detection (port from clumanager 1.2.26) Fix options description Add NBB1600 support + fix IPS800[CE] support Fix GCC4 warnings msg_init isn't ready yet; remove for now Fix timeouts for 32-way bull machines Fix frim Birger Arbitrary resource tree patch Fix 157248 Fix file descriptor leak in services.c Fix fd leak, change resource-group -> service, fix node ID display Fix API change: we no longer get Logged_out after Fenced Bull PAP + Bull IPMI-over-LAN support fix targetted relocation bug when running with gulm Fix arg parsing Fix arg swap problem when reading from stdin Fix for example.conf Make magma_ucman deliver CE_SHUTDOWN when it should; make magma_tool ignore SIGPIPE so it can trap for closed sockets. Ask for node name instead of for the existence of children from ccs Don't assume child nodes exist just because someone asks for them Add patch from Frederik Schueler to remove implicit dependency on rdisc Add Patrick's initial fence_xen to fence Xen virtual machines. Fix bug in ip.sh which would match 10.1.1.1 as being the same as 10.1.1.111 Don't use _syscall macro, patch from Adam Conrad \Fix 162805\ Fix type causing verify-all to not work properly Apply patch from Eric Kerin to fix #162824, fix #162936 fix 157327 Fix 159637 Fix 163651 Fix 159767 fix 162501 fix 164627 Fix 167216 -- ip.sh script errors Fix Joe Orton's comments re: calling ld instead of gcc -shared... 
Ensure proper linking Add VMWare ESX server fencing from Zach Lowry Clustat partial rewrite (still needs updated XML output) Incorporate patch from 165447 Don't build clufindhostname; it's not even in the tree Add support for inheritance in the form "type%attribute" instead of just Mono-NFS server support (e.g. one NFS server per cluster, active-passive). Add logging library Mono-NFS server resource agent Fix #171253 Ensure rgmanager doesn't block SIGSEGV when debug is not enabled. Apply patch from Axel Thimm to fix bz172066 Fix bugs 172177, 172178 Fix rest of 172178 Fix #172401 Fix for 172441 from jparsons Fix #162605 Allow scripts to inherit the name attr of a parent in case the script wants to know it (#172310) Fix #165447 - ip.sh fails when using VLAN on bonded interface Fix #171153 - pass 1 - clustat withholds information if run on multiple members simultaneously Fix #171236 - pass 1 - ia64 alignment warnings Fix #173526 - Samba Resource Agent Fix #173916 - rgmanager log level change requires restart Fix #174819 - clustat crashes if ccsd is not running Fix #175106 - lsof -b blocks when using gethostbyname causing slow force-unmount when DNS is broken Fix #175108 - rgmanager storing extraneous info using VF Fix #175114 - rgmanager uses wrong stop-order for unspecified resource agents Implement 175215: Inherit fsid for nfs exports Fix #175229 - remove unneeded references to clurmtabd; it is no longer a necessary piece for NFS failover Fix #175033 - magma-plugins incorrect read behavior for /proc/cluster/services > 4096 bytes Bump SM plugin version Fix #175033 part 2 - read in page-size chunks from /proc/cluster/services Fix 176343 - __builtin_return_address(x) for x>0 is never guaranteed to work Merge ccsd local socket patch from RHEL4 / STABLE Add CLK_HOLDER flag: Tells the plugin to return an allocated uint64_t; #178024 - also fixes incorrect API documentation in man page Fix 178026 - provide lock holder on request from SM. 
Fix 178080 - implements work around dlm release 177934 by taking a NULL lock on lockspace acquire. Fix #166109 - random segfault in clurgmgrd. Fix most of 177467 - clustat hang (does not fix the case where the lock manager never responds to a request). Fix bug in smb.sh associated with ccs descriptors > 255. Fix #178249 - debug messages from gulm.so Fix broken build Fix 178249 - pass 2 Fix 179063 - some options missing from nfsclient option handler fix #179662 Agent fixes for #178314 Add simple scoring/disk-based quorum daemon Fix build on 64-bit arches Start of ripping out dependency on magma; builds but does not currently work Patch from Fabio Massimo Di Nitto: Fix includes Include missing .c files in src/clulib; remove defunct src/daemons/members.c Implements 'label' support for qdisk. Uses /proc/partitions for device Fix includes for build on ia64 - Make rgmanager actually do things. Fix missing/non-updated #includes Patch for in-tree builds from Fabio M. Di Nitto. <fabbione at ubuntu.com> Fix #198406 - lack of ipv6 support in clufindhostname.c Add missing xenvm.sh resource Fix licensing information in resources *** empty log message *** Add preliminary live-migration support (e.g. for Xen for FC6 Fix typo in Makefile Add man pages for qdisk fix 200449 - status checks wrong * src/clulib/ckpt_state.c: Preliminary implementation of replacement Fix parameter ordering for calling cman_send_data_unlocked Fix relocation & transition handling Apply Navid's patch to -head 2006-08-18 Lon Hohberger <lhh at redhat.com> Fix 200776 - mixed up default log level constants 2006-08-21 Lon Hohberger <lhh at redhat.com> 2006-09-01 Lon Hohberger <lhh at redhat.com> Apply resource-instance-name.patch Fix failed->disabled state transitions; #208011 Fix various bugs, incl. 208011, 203762 Apply patch from Fabio M. 
Di Nitto to fix clustat service name expansion bug Clean up build Fix 202498 Fix segfault due to missing param Fix #208577 Implementation of client/server based Xen Virtual Machine (xvm) fencing. Add --enable_xen configuration option (off by default), and make sure -V flag works for fence_xvm[d] Fix #208115 Fix #202497 Fixed 202492, not 202497... Ancillary patch to fix 202492 and actually add back groupmember attr, not just rgmanager (per-node) attr Fix #209544 - umount failing on gfs/nfs services Updated xenvm resource agent Compatibility fix for resource agents between linux-cluster and linux-ha Fix 202637 - error reporting missing from some agents Roll back patch to clusterfs.sh Roll back patch to resrules.c Fix #211701 (rgmanager + clustat hangs), #211933 (xenvm rename -> vm) Fix #212074 Update Changelog Apply patch to fix build on newer kernels from Fabio M. Di Nitto <fabbione at ubuntu.com> Fix bugzillas #212444, #212433 Fix bugzilla #212474; fully integrates fence_xvmd with ccs & the cman init script Fix error reporting from cman if run while xend is not running. Fix #213218 Fix #213878 - segfault in rg_thread.c due to improper loop semantics Fix bug reported by Fabio M. Di Nitto - duplicate definition of assign_noccs Fix bug where fence agents were getting info up to groupd Handle 0.1.9 case of libvirt returning a virDomainPtr + state for a VM that doesn't exist (vm state == VIR_DOMAIN_SHUTOFF) Fix build error Fix segfault in clustat if node is not a cluster member Fix #211468 - clustat always returns 0, but should give a nonzero code for non-running services. Fix #216774; missed rg_thread.c Fix #216774 Fix #216774, pass 3 Implement cap on max # of outstanding status check threads; fixes bugzilla #218697 Resolves: #221210 Fix bug causing cluster.conf / rm log level to be ignored in resource agents Apply patch from Simone Gotti; fixes #222744/#222838 Resolves: #222485; patch from Simone Gotti Fix #222961 - required for Conga to work. 
Resolves bugzillas: #213533, #216092, #220211, #223002, #223234/#223240 Simple manual override for fenced & example replacement for fence_ack_manual Use /proc/uptime by default instead of gettimeofday(2) for internal timings to avoid problems when the clock is reset by NTP Patch from Fabio Massimo Di Nitto - Fix portability of getuptime function Fix #222484 merge fixes from RHEL5 branch Clean up test cases Add override for action timings Add list_prepend macro Port fix for logging of errors in config from RHEL5 branch Fix 223519 Add error reporting if msg_open fails; patch from Josef Whiter Don't query rgmanager if the user only wants a node state Apply fixes from RHEL4 branch Remove fence_manual; only provide manual-failure override Make service.sh understand lvm RA type Add RA installs to trunk; Make sure utility stuff is installed in the right place Add member_util.sh functions Add member_util.sh to installation Fix missing copytobin target for RHEL4 branch Add LVM failover agent; by Jon Brassow Fix 229254 - extraneous man pages, 228823 - allow disable of services stuck in 'stopping' state Initial checkin of simple dependency engine Add missing comment Check in missing header Remove ancient / unused script Add example test configuration for dtest Resolves: 229338 Resolves: 222445 Fix anonuid/anongid parsing in nfsclient.sh Make status checks happen at 'start' time (parent-before-child) instead of 'stop' time (parent-after-child). Resolves: 231151 Fix missing newline in debug message Add open failure message Fix 213241 Fix help message Strings cleanup. Enable vm.sh live migration. Apply build cleanup patch from Fabio M. Di Nitto Force release of lockspace; patch from Patrick Caulfield Fix clean target; patch from Fabio M. 
Di Nitto Fix multimaster bug: ensure timings are accurate and provide multi-master conflict resolution Use more strict build options Merge ordering patch from RHEL4 branch; update automated test cases Remove dead code; fix build_tree loop Merge patch from Crosswalk team Fix SPARC / HPPA build; patch from Fabio M. Di Nitto Kill VM machine immediately; patch from Jeroen van den Horn allow ocfs[2] to work with the clusterfs resource agent. Also, commit patch which corrects interval processing for status operations Make agents more OCF (Open Cluster Famework) compliant Fix watchdog race on rgmanager exit; BZ#236204, patch from Andrey Mirkin fix depends.h/depends.c Add obvious requirement on shared resource case as suggested by Simone Gotti Fix dtest.c compile errors Cleanups to make the resource agents behave better (return OCF_NOT_RUNNING, for example) Apply patch from Andrey Mirkin to fix 237144 Apply patch from Simone Gotti to fix logging errors in clusterfs.sh Fix bug 234589 Fix #231521 Re-fix #222484 Add patch from Simone Gotti to implement service freeze/unfreeze. Add simple buffer handling for later use. Fix corner case reported in #212121 Add test case from RHEL4 branch Apply patch to fix bugzilla # 232140 Add SAPInstance and SAPDatabase resource agents to HEAD Readding SAPInstance/SAPDatabase Add SAP agents; resolves #238916 Make manual fencing's command line parser backward compatible; per dct Fix typos in resource script logging Add missing primary keys to SAP agents Update Fix 234249, 229650 Fix #229650 - part 2; fixes an uninitialized var problem Fix status check Fix type size for 32/64-bit mixed clusters Fix #243691/2 Fix update failure if node was fenced Make exclusive resources work again Fix missing label Fix full-virt rebooting (#243872); add local-only / no-cluster mode to fence_xvmd Ancillary patch to fix debug output Clean up testprog in make clean Make lan+ work if built as a STONITH module Remove testprog target. 
Merge from RHEL5 branch Add note to usage.txt for configuring on 64-bit environments Fix #237144 - pass 2. All testcases accounted for now. Resolves: 247488 Misc. bugfixes; see ChangeLog Fix #249758 Fix #250152 Fix bug #248727 Fix build problem Fix #248727, round 2 Fix uninitialized var Fix #229650, pass 3 Fix #258141 - possible use after free in fenced Make fence_xvmd read options from ccs like it should; merge dbg_printf patch from RHEL5 branch Make fence_xvmd read options from ccs like it should Include missing debug.h header file Merges from RHEL5 branch - round 1. Merges from RHEL5 branch - round 2. Merge from RHEL5 branch, pass 3 Add centralized S/Lang event script engine v0.8.1 Merge force-unmount from RHEL5 branch for netfs.sh script Preliminary GFS2 support in clusterfs.sh Add missing sets.h Fix format warnings on newer GCC Make S-Lang library & include paths configurable. Fix type-punned errors on i386 Add missing ds.h Misc. minor central processing bugfixes Add return value for inability to run due to exclusive flag being present Fix misc central events bugs. fix typo in clusterfs.sh Fix #254111 - when stopping a service using a shared GFS resource, it umounts it even if other services are using it. Allow soft dependencies when central_processing is enabled Fix endian issue on big-endian arches Correct signed vs. unsigned comparison on sparc64 Figure out where slang is installed. Fix build problem reported by Chris Feist Roll back previous patch to ip_lookup.c Fix #60 error in #428346 bug file oracledb.sh was initially added on branch RHEL5. fix 429248 Fix ccs connect error handling Unblock signals after fork() so heuristics using signals don't hang Fix #430272, #430220 Fix qdiskd master abdication logic (#430264) Make default TTL 4 instead of 2 per Fabio's recommendation (e.g. RFC2608). Make TTL configurable in cluster.conf/command line for fence_xvm. Make fenced's override wait time configurable. 
Fix short read handling in read_pipe Correct incorrect netmask handling in ip.sh * Make fence_ack_manual.sh accept -n Fix #435189 - fenced override doesn't allow rgmanager to recover because Merge branch 'master' of ssh://lhh@sources.redhat.com/git/cluster Add Sybase failover agent Update changelog Add / fix Oracle 10g failover agent Merge branch 'master' of ssh://lhh@sources.redhat.com/git/cluster [fence] Make fence_xvmd support reloading of key files on the fly. [CMAN] Fix "Node X is undead" loop bug [rgmanager] Don't call quotaoff if quotas are not used [rgmanager] Make ip.sh check link states of non-ethernet devices [rgmanager] Set cloexec bit in msg_socket.c [rgmanager] Fix #432998 [rgmanager] Remove unused lockspace.c file [cman] Merge scandisk & fixes from RHEL5 branch [cman] Fix qdisk Makefile / disk_util merge bugs [cman] Make mkqdisk print all device paths [cman] Apply missing fix for #315711 [cman/qdisk] Fix type pun errors in proc.c [CMAN] Make cman init script start qdiskd intelligently Revert "[CMAN] Make cman init script start qdiskd intelligently" [CMAN] Make cman init script start qdiskd intelligently [fence] Preliminary TPS/NBB/NPS support in new WTI agent. [fence] Close file descriptors that are in invalid/error states Remove clushutdown man page references from clusvcadm.8; resolves #324151 [cman] Close sockets in error state in gfs_controld / dlmtest2 / groupd test [rgmanager] Fix #441582 - symlinks in mount points causing failures [rgmanager] Apply patch from Marcelo Azevedo to make migration more robust [rgmanager] Fix live migration option (broken in last commit) [rgmanager] Use /cluster/rm instead of //rm Fix #362351 - make fence_xvmd work in no-cluster mode Ancillary NOCLUSTER mode fixes for fence_xvmd Ancillary NOCLUSTER mode fixes for fence_xvmd [rgmanager] Make rgmanager check pbond links correctly Revert "[fence] fence_xvmd: Add KVM support; misc cleanups." [fence] fence_xvmd: Add KVM support; misc cleanups. 
[rgmanager] Fix erroneous broadcast matching in ip.sh [fence] Port XVM to logsys [fence] Fix XVM's debug.c default [fence] Make fence_xvm[d] use normal log levels [rgmanager] Add optional save/restore to vm resource [qdisk] Make stop_cman="1" work if heuristics fail during initialization [rgmanager] Fix resource agent metadata and un-break 'make check' target [rgmanager] Re-fix permissions bits broken in last commit rgmanager: Ancillary fix for rhbz #453000 cman: Fix qdiskd file descriptor leak cman: show '-d' option in mkqdisk -h and mkqdisk.8 rgmanager: Make freeze/unfreeze work with central_processing rgmanager: Detect restricted failover domain crash rgmanager: Permit careful restart w/o disturbing services rgmanager: Wait for fence domain join to complete rgmanager: Fix up clusvcadm.8 manual page to show -M option rgmanager: make status poll interval configurable rgmanager: Clean up build rgmanager: Implement enforcement of timeouts on a per-resource basis rgmanager: Make clustat and clusvcadm work faster rgmanager: Resolve hostnames->IPs and back when checking NFS clients cman: Fix broken qdisk main.c patch reverted with scandisk merge cman: Don't let qdiskd update cman if the disk is unavailable rgmanager: First pass at port to logsys group: Allow group_tool ls <name> <level> to be scriptable rgmanager: make clulog build even though it's incomplete rgmanager: don't change the build target just yet [fence] Fix fence_xvmd trying to read wrong args from ccs [fence] Make fence_xvmd "reboot" work with newer versions of libvirt qdisk: fix block size check fence: Fix bug in make_args() liblogthread: Fix sefault if fopen() fails for any reason rgmanager: Nuke clurmtabd since it's not used/needed rgmanager: Rename clurgmgrd -> rgmanager rgmanager: Enable stderr logging when run in foreground rgmanager: Use CCS again instead of building everything NO_CCS rgmanager: Fix debug build error rgmanager: Handle notifications from cman for config updates rgmanager: Remove 
polling code; misc cleanups rgmanager: Put init_resource_groups prototype in one place rgmanager: make clulog accept "-" as the first char in messages rgmanager: Avoid status checks during reconfiguration qdiskd: Always use O_NONBLOCK when writing to status_file qdiskd: Process reconfiguration events from CMAN rgmanager: make max_restarts work w/o restart_expire_time qdisk: Make online reconfig actually work qdisk: Update man page. Nuke crc32 code and use zlib. qdisk: Remove antique #ifdefs for old kernel-mode CMAN liblogthread: work with stderr etc qdiskd: Misc. cleanups, esp. loop cleanups in main.c qdisk: More misc cleanups. qdisk: Allow old logging style until next release rgmanager: make dtest compile rgmanager: Include follow-service.sl in the install rgmanager: Fix license for follow-service.sl logthread: Make multiple init/exit calls work logthread: Add missing prototype fence_xvm: Use new logging config parameters rgmanager: Fix up logging, part 1 rgmanager: Part 2 - flip logging function name gfs2: Fix handling of mount points with spaces
Marc - A. Dahlhaus (1):
      [MISC] Add version string to -V options of dlm_tool and group deamons
Marek 'marx' Grac (56): Fix #203720. Do not run backup copies (ends with ~) of resource agents. Bug #204054. Adding MySQL resource agents and utilities which will be common for other RA. Bug #204057. Adding Apache resource agent and utility which parse httpd.conf. Minor changes. Bug #204060. Adding OpenLDAP resource agent Simplifying scripts: typing error PID files are stored in common directory. Name of the PID file is generated from the OCF_RESOURCE_INSTANCE. Resource agents for Apache, MySQL and OpenLDAP are updated. Bug #204058. Adding resource agent for PostgreSQL 8 Adds possibility to add command line options to Apache RA. Names of variable in RA's metadata are changed to unify style. Adds possibility to add command line options to MySQL RA. Names of variable in RA's metadata are changed to unify style. After upgrade to 'unified names for PID files' we can clean code a bit. Adds new function (generate_name_for_pid_dir()). Minor update of messages texts. Adding Samba resource agent (tag <samba>). We already have resource agent for Samba but this is written in the same way as the other application's RA (mysql, apache, ...). Old-style RA stays available (tag <smb>) so it won't break backward compatibility. Add check if the instance of RA has parent (variable service_name) Some application needs time until they stop all theirs processes, so we have to wait a few moments until main/parent process is finished. This patch adds an option 'shutdown_wait' for each application's RA. Test if PID file of the application points to running PID. If not then this PID file is deleted and application can start. This patch pushes generated configuration files for service in /etc/cluster/ (RA_COMMON_conf_dir) where each service (samba, openldap, ...) has it's own directory. In this directory is another directory with instances (OCF_RESOURCE_INSTANCE). Bug #204784. Adding Tomcat resource agent Script for parsing Tomcat's conf/server.xml Bug #213524. 
Resource agent for named + patch for stopping applications Bug: 212479 - ip.sh causes /sbin/ip to produce warnings New flag -F for clusvcadm to respect failover domain (#211469). Also changes clusvcadm -e service00 which enable service on local node and do not respect failover (same as in RHEL4, in RHEL 5.0 it just wrote Failure). Resolves: #245178 - install RA for named (agent already in CVS) Resolves: #250681 - mount samba share from netfs RA fence/agents: New fencings agents fence/agents: WTI agents merged fence/agents: Add obsolete options [RGMANAGER] Fixed typo in mysql.metadata [FENCE] SSH support using stdin options [FENCE] Fix #435154: Support for 24 port APC fencing device [FENCE] Fix name of the option in fencing library [FENCE] Fix problem with different menu for admin/user for APC [FENCE] Fix typo in name of the exceptions in fencing agents [FENCE] Fix #248609: SSH support in Bladecenter fencing (ssh) [FENCE] Fix #446995: Parse error: Unknown option 'switch=3' [FENCE] Fix #447378 - fence_apc unable to connect via ssh to APC 7900 [FENCE]: Fix #237266: New fence agent for HMC/LPAR [FENCE] Fix #446995: Unknown option [FENCE] Fix: 447378: fence_apc unable to connect via ssh to APC 7900 Fixes #445662: names of resources with spaces are mishandled [FENCE] Bug #448822: fence_ilo doesn't work with iLO [FENCE] Fix #448043 - Update man pages for fence agents [FENCE] Fix #237266 - LPAR/HMC fence agent [FENCE] Fix #460054 - fence_apc fails with pexpect exception [FENCE] Fix #290231 - "Switch (optional)" param does not default to "1" and program fails [RGMANAGER] - Fix #462910 postgres-8.sh and metadata fixes [fence] Operation 'list' for APC fence agent [fence] Operation 'list' and 'monitor' for iLO, DRAC5 and APC [fence] Operation 'list' and 'monitor' for WTI IPS 800-CE [fence] WTI should not power on/off plug if it is unable to get status [fence] WTI should not power on/off plug if it is unable to get status [FENCE] Support for long options (eg. 
--ssh, --help) [fence] Extension to fence agent for BladeCenter with 'list'/'monitor' operation [fence] Extension to fence agent for LPAR/HMC with 'list'/'monitor' operation [FENCE] Support for 'metadata' option for fencing agents / default values
Mark Grimme (1):
      rgmanager: Add follows-service script
Mark Hlawatschek (2):
      mount.gfs2: skip mtab updates
      rgmanager: Update SAPInstance / SAPDatabase to current versions
Michael Conrad Tadpol Tilstra (111): - this is the idea I have for having gulm use ccs. This code here - fixes bug #126970 - removed the unneeded utils_verb_flags stuff from lock_gulm.ko - <can't say anything nice> - CFLAGS in make/defines.mk is ignored if you don't use += in the Makefiles - reshaped the way i was building the lock key names. This new method - gulm will now get its configs from ccsd. - decent start of documenting cmdline args and config file for gulm. - this is the opteron stack bug fix here too. - fixed a few ickys. - The first 'half' of range lock support for gulm. Doesn't actually do - turned off a couple debugging things. - found some stuff I forgot to remove. - use safer naming for the dump files that go into /tmp. - temporarily remove ccs from gulm. Its way not working. Need to figure - cleaned up the makefile as per aj req. - moved the option parsing code to where it was a while ago. I didn't make - from the top, its src/lock_gulmd and src/gulm_tool now. - from the top, its lib/libgulm.a now. - added --use_ccs to man page. - added --use_ccs to --help - The first pass at getting range locking for gulm. This is server changes Basic posix range locks work in gulm now. Lots of changes required to - these build again. (forgot to update them with library.) - fixing an ism of gulm's gfs-only past. There was not a way for - update the gulm lib to reflect new quorate info. - have magma use the new quorate info. - removed some code that wasn't called anymore. - removed unused function. - changes to match last change in userspace. - must remember to decrement holder count when removing holders from - fix inaccuracies in the counters. *sigh* - fiddled with a few comments. - try to be a bit clearer about how you can name thigns in the servers - the kludging required server side to get something that posix can - the kernel side of things for getting the posix range F_GETLK cmd. - was double decrementing the expired holders counter. 
- Fixed tiny memory leak in config parsing. - Initalize entire sockaddr6 for binding. - don't logout on kill -TERM - plug leaky memory. - don't return positive error codes on loc module mount failures. - VFS does weird things with the error results, so before we try to - get builds with --prefix working. - This removes the limit of 300 jid mappings. Here's a patch to fix a gulm_tool bug that allows a user to enter the - stop printing out lock ids in different formats. its confusing. - i know they're unused, but it hurt my brain less if they're atleast - Added command to gulm_tool to query the services that have connected Don't requeue front on conflict with priority flag. - Updated lcok_gulm for the 2.6.9 kernel. - update the patch. - sometimes when starting multiple gulm server for the first time, - stop multiple local jid list scans on the same fs from trampling - removed node locks. The things they worked around in the past do - If a node remounts on the jid while we are replaying thier journal, Fixed some queue jumping. If the lock was held shared, and a new Fixed an end case weirdism. Allow gulm services to lock core. Locked core will ignore the Fix __you_cannot_kmalloc_that_much by switching the large kmallocs to By default, gulm determins the name of the machine it is on with - get lock_gulm.patch uptodate with source. Fix lock_gulm.ko for nonSMP kernels. Silly me, spinlock_t has no size The compiler is generally better at knowing when to try to inline if a plock IS_SETLKW, then it is a *blocking* request. switched = to += so I can add paths to LIBS spell generation correctly in prints. add servicelist to man - SIGTERM is ignored. Update man page describing this. - lock tables are supposed to register themselves are LT%03d, not %03d code to allow expiration of a subset of locks. This will be used by Most of the witdraw code. Still lacking a bit to notify other updated the patch for last lock_gulm.ko commit. moved a message from Always to locking. 
So here is a bit of code that finishes up the withdraw support for Wasn't calling ccs_disconnect() when reading config from ccs. I ran into some weird race with having gulm load info from ccsd more previous method of not calling out to ccsd more than once had issues Added a warning for if you didn't --cluster_name before you --use_ccs. added a way to query gulms current running config from gulm_tool print node name and ip on the protocol mismatch error message. fix for bug 144909 stop gulm_tool from printing its errors to syslog. gulm will now parse the server list from ccsd as use lockserver instead of server. less confusing. bleh, forgot man page. grab more of the login request for services so if the protocl version relocated some tags to attributes. Easier for the gui to work and stop being wavery on where tags go. Forgot to update gio. fix for bug 146479, was not properly pushing posix range lock pending plocks on gulm can now be interrupted. Handful of little cleaning things I ran across hunting a bug. Partial workthrough of log messages. Some moving, some changing. Removed a bunch of unused global vars. (well, externs to atleast) put the info that i thought was in the man pages in there. removed an unused func. is deprecated, not depercated If a local client logged into gulm, and the client got the nodelist reduced things down to a single ASSERT. passing back error codes and Was storing entries under the same name, this prevented withdraw from soft link libs for lon install symlinks to the .so fixup libs gcc4-isms fixup man page. Added some diagnostic messages for when clients/slaves cannot connect gcc4 hates me. same sort of compiler fixes that i did in userspace. added a warning.
Nicolas MONNET (1):
      rgmanager: Make postgres-8.sh use su instead of sudo
Patrick Caulfield (669): Create /dev/misc if it doesn't exist. Some older systems need explicit -lpthread Make sure that the backout time for a "NEWCLUSTER" message is less than the joinwait time. Add in more architectures. Fix a couple of 64bit compiler warnings. Add support for more VMS-like locking mode where new locks will not be granted Tidy atomic decrement. Fix assertion. Add timeout feature for resdir entries so they don't go away as soon Stop users calling dlm_[ls_]_pthread_init() more than once. Fix an obscure but potentially nasty race on the resource directory. Remove resdir sequence number from all structs as it's no longer used. Missed a couple of files from the last tidying commit Make DLM_LSF_NOCONVGRANT the default for userland lockspaces. Clean up the temp nodeIDs list at shutdown. Fix some small odd bugs in startup conditions: Validate nodeIDs oops, remove annoying (and badly formatted) debug prints If there's anything left to read after recvmsg, then make sure we get it. Tidy up accept path a bit. Print error if no nodeIDs are passed to "cman_tool kill" Man page for cman_tool Install man page Fix race where two reads/accepts could arrive in quick sucession but only the Don't clear the temp_nodeid until we've /really/ finished with it. Tidy up temp nodeids after a transition. Distinguish between not being able to get the cluster ACTIVE state, A couple of endian fixes. neaten up some bits by calling send_kill() rather than building up our own Get rid of zero-initialisers as the compiler (or is it the linker? Pass the unqualified name into the kernel, so the cluster node name never Fix race on islistening queries. Clear the "cluster_is_quorate" flag at shutdown Do the parameter overrides correctly. Remove some redundant code. Always copy AST parameters, even for converts. Always copy ast bits, even on convert (kernel vesion). 
Change the initialisation order, so if we get into the kernel twice by some Example init script for cman/dlm/fence/clvm Replaced setsockopts with ioctls. Use new ioctl interface to start the cluster. The "lets get all the API changes done in one day" checkin. Userspace side of last checkin - sync routines for locking. Set return code if we can't leave the cluster. Return error if we can't start all the listening sockets. reinstate copy of lock name that got lost in the last change. Use re-entrant gethostbyname2_r as we are juggling two hostent structures Fix a couple of error paths so they set errno. Put some locking around the lockspace list. Don't cleanr li_flags after setting the FIRSTLOCK bit Don't deliver blocking ASTs to locks that are in progress. Don't grant locks that have are waiting for unlock. If the dlm_unlock() call fails, the put the lock back on the ownerqueue I think it would be a really nice idea to actually pass the Don't update the bastaddr/param for a convert until the conversion has actually Knock the "I'm not getting out of bed for a message that small" size down Remove duplicate check. Fix userspace LVBs (thanks, Jeff) Don't hang lkbs off the ownerqueue list as we don't have any control Create /dev/misc with reasonable permissions. Change AF_ number to 30 so it doesn't conflict with bluetooth. Tidy LVB handling Add PID to a lock which can be returned using the Query API. Also include the pid in LKB rebuild Update libdlm doc. Setting a bad example... Use seq_file for /proc/cluster/services so we don't crash the kernel Return the lockid to userland as soon as we know it. Run dlm_recoverd only when we need to do recovery. Be a little less ACK-happy during transition. ACKs can now be embedded in data use REPLYEXP for ISLISTENING messages pack the comms structures. They are already well-aligned but the compiler Get rid of some debugging info that is no longer relevant (or even compiles). Fix stupid cut & paste bug with IPV6 multicast. 
Don't send HELLO as soon as the kthread is started as that might b e too early. A couple of endian fixes. mixed-endian clusters now seem to So /that's/ why it was up there... Rename "local_nodeid" so it doesn't clash with kernel's internal use. Change cman_tool kill to take a node name instead of a node number. Improve (I think) the usage message. Well, it's more verbose anyway. Tweak the way NEWCLUSTER works in an attempt to prevent the splits that Fix a subtle bug in the node IDs code where a node could get a different Don't get blinkered when in NEWCLUSTER modes...there may still be things out Free cluster ref if we fail to start. Clear the use_count if shut down with "force" Make all node removal happen in remove_node(). This has the nice Make sure we wake the membership thread before waiting for it to complete There is a small chance that rem_node could be NULL when passed into Allow a bit more flexibility in how nodes are specified in cluster.conf Remove distracting comment to which the answer was "no, it doesn't" Remove ourself from the waitqueue when told to quit. Add support for assigned nodeids so that people can have permanent node IDs Add support for allocated node IDs, both on the command-line and from CCS Be a bit more paranoid about creating the DLM device node. If it doesn't Add (untested) SELinux support Tidy. Fix that deadlock that's been there for ages but only ever pops its head Always return non-zero exit code if we hit an error. Yet another one of those "I can't beleive we've not seen it before" Use C99 initialisers. Also include module reference in file_ops. Fix bug in debugging routine. Add module ownership to various structures so we don't get unloaded whilst Free unused direntry structs when releasing a lockspace. Only stop recoverd if it is running Tidy daemon shutdown so that membership is always responsible for Print "left cluster" reasons as text. 
Make /proc/cluster/status a bit more consistent in its output, Remove state REMOTEMEMBER as it's not been used for ages. I've started the the kernel's interface numbers start at 1, not zero. So we need to Get rid of REMOTEMEMBER Wait for membershipd to shutdown before starting to clean up. Make sure the nodeids array is increased enough to cope with very Sanity check the nodeID Don't say "we are leaving the cluster" when we never actually join it. Honour the wanted_nodeid when we are the first node in the cluster. I am an idiot Kernel check for max nodeid too. in case anyone bypasses cman_tool Though Id checked this in last week. JOINING timeout should go back to JOINWAIT rather than giving up. Update highest_nodeid at the client-end of a transition too. There is a tiny Tidy the close_connection() routine. in addition add a parameter telling Forgot to set node_state when a node is killed via STARTTRANS Close the lowcomms sockets during recovery rather than as soon as Quit rx loop if the thread is closing down. Don't free "othercon" connections when they are closed as they might still be Move pingtest into userspace and check for invalidated LVBs Open the default lockspace if dlm_get_fd() is called before any other locking Simple LVB test prog Fix lock modes constants that were incorrect. Dear, oh dear this was out of date. Quick overview of the libdlm function calls. add DLM_SBF_VALNOTVALID so that the examples compile again. Improve support for PERSISTENT locks, and make the lkid checking a bit A simple DLM "hello world" example from Daniel. Compile two DLM libraries, one that needs pthreads, and one that doesn't Add a loopback to the comms layer so that clients can send messages to Changed the protocol header to include the source port number so it can be Add lock ORPHAN state, and associated query. Don't call wake_astd when we're doing a remote unlock. Add some (optional) stats collecting. 
Revert 1.40 as it causes astd to spin for some reason I haven't fathomed yet. Only wake astd when there is work for it to do. Close any created listening sockets "listen_for_all" fails Don't deref freed skb. Remember to free connections[0] if initialisation fails. Add priority to cman sockets warning - large checkin. Tidy the userland API so it takes nodeid 0 to mean "us" Add userland API to get the cluster name/ID First cut of a libcman - comments welcome Example of using libcman Add some locking around queue traversal. Keep a local copy of cbinfo->isoob as cbinfo can be freed before reaching the lowcomms_close can be called when atomic, so we can't use nodeid2con Only overwrite the user's LVB if the lockop has changed it Move lowcomms_close() outside of the spinlock, as it may want to sleep. Tidied up the userland/kernel interface so that all the data transfer happens Remember to assign parent during RSB rebuild. Don't try to reconnect when we get EOF on a socket. When comparing node states, "JOINING" is effectively the same as "DEAD", so Byteswap (if appropriate) new rl_flags field. Return the LVB "INVALID" state to users when the LVB has been invalidated. Fix refcounting error. Decrement the module count if returning -EEXIST from dlm_new_lockspace(). Get rid of suprious ASSERT in dlm_unlock that broke cancellations. Add a flag to userland lockspaces that will cause them to be deleted Make the default lockspace AUTOFREE so that it gets deleted when Udev script (goes in /etc/udev/rules.d) for creating DLM devices. use GET_ALL_MEMBERS to return number of nodes in the cluster as it's consistent Make the API a little more consistent. nodeid==0 always means (this node) Undo last check-in for this file that came from a bogus tree. Fix debug print Fix broadcast. Add a compatibility layer (conditionally compiled in) for using a 32bit Fix some comments Change dead_node_lock to a spinlock and don't hold it for nearly as long. Clean joining_nodeid in a few places. 
If a nodes dies after beiung sent a JOINCONF then remove it from all nodes Tweaks to init script Remove some redundant code. Make command-line options override CCS rather than the other way round. Tidy the language, fix a few typos, and update it a little. header->flags needs to be byteswapped since I made it an int. Tidy up the node_id assigning code. With the changes made a while ago, a lot Clear joining_node after a client-end transition Make sure transitionreason gets set if inherit mastership from a dead Add send/recv API to libcman, and some comments to libcman.h Don't send a KILL message if a node has the wrong generation number, try to lkb_dequeue s/b res_lkb_dequeue Don't use a large(ish) static buffer for the membership state. Fix a few join related bugs. Many of which are related to #142853 and # 1335212. If a transition gets usurped by another node, always tidy up old joining Be a lot quieter on the console Don't hold the res_lock for quite so long. Make MAX_RETRIES /proc settable. Use sock_create_kern() rather than sock_alloc() for 2.6.10 NULL some pointers when we shutdown. Don't loop ourself to death if we run out of memory. Move find_minor_from_proc before it's first use. If bind() failes then NULl con->sock too, so the tidy up doesn't cause an oops. Make all nodes print a message saying why another node left the cluster, so Clean the queued_messages list at shutdown. If we get nominated as master, remove any joining node we may have. If the cluster gets down to 1 node and the last leaver left with "remove" Don't commit with debugging enabled. ! Add SAF AIS lock API support Sanity-check the votes, so that expected_votes doesn't get silly. Add wait options to cman_tool to help with script synchronisation. Remove some redundant code. return error if ioctl(GETNODE) called before we are a cluster member. 
Check for valid LKB in find_lock_by_id() rather than later on when we've Another remove_joiner() needed - this time if the new node does not respond Set threads to SCHED_FIFO scheduling policy. Remove param from remove_joiner() as that part of the patch hasn't Return an error from several ioctls if they are called when we are not part of remove_joiner() now also informs the poor node that it's join has been Grr, got me patches & cvs all mixed up again. If we get an old STARTTRANS(REMNODE) then still remove the node from our Split the removal of a node out of STARTTRANS as they can get Put some more validation on integers passed in from the commandline. Change a BUG() into a printk, it's not really /that/ serious an anomoly. Don't call nodeid2con() if we're shutting down, it might allocate a new con. Don't starve processes that are filling buffers. If we get a position JOINACK then ignore any negative ones that come afterwards Make heartbeat thread exit whren "quit_threads" gets set. This also mean dlm_release_lockspace uses force==1 by default, so the LS gets Use #defined constant rather than a plain number Default for releasing userland lockspaces is "1", ie get rid on any master If a joining node is removed by the time it has become a provisional member Check quit_threads in a few more places so we don't get blocked use $(CC) for linking. Be a bit smarter about when to schedule() when reading lots of data Add a -w (wait) option to "cman_tool leave". man page for cman_tool leave -w Don't start the transition timer if we're doing the shortcut single-node Display node name in /proc/cluster/status On sparc & s390 do biarch checking at runtime. This paves the way for Remember to tell SM if we get down to one node. This fixes the socket leak in the case where a primary connection was Deal with failed joinconf more sensibly. 
(that message may have to go though) Display "mantis-friendly" membership state in /proc/cluster/status I was trying to be too economical with code in cman_tool wait, Remove kjoin.c as it's not used and won't even compile any more. Put some locking round membership_task so we don't try to wake up Add -t option to join/leave/wait, specifies the maximum amount of Add ioctl32 support. Need this too. Get rid of spurious "up" in barrier error path. Don't send sequence number of zero, it causes trouble. Set the socket priority to INTERACTIVE to ensure If we are a new master, don't try to rejoin an old node. Replace the array of connections with functions from linux/idr.h Tidy printks Set close-on-exec flag on DLM file descriptors Clean transitionreason after a state transition has finished. Fix dependancies for join_ccs.o lowcomms for the new src2 DLM. Don't try to add too many addresses. Remove unused variable Build libcman Make the _sync calls available to non-pthread applications. Better error messages Don't need -lpthread Replace the old nodeids array with idr_ routines. Slightly more sensible error returns for some join functions too. Increase size of gethostbyname_r buffer and improve error if it fails. Don't return an error on normal, synchronous, non-threaded unlock Set join time on local node Use correct errno when reporting errors from gethostbyname2_r man pages are wnderful things - when you read them. Fix usage message for -n Fix memory leak if a joining node fails. Remove redundant struct member Fix a couple of memory leaks Add a test prog that got lost Say something if sendmsg fails. Get userland working again. Set unlock artarg Seperate out device.c into its own module that only depends on the Add DLM_SBF_LVBUPDATED, needed by userland i/f use lvm_operations array to determine whether the LVB was updated or not. zero the difference between a sockaddr_* and a sockaddr_storage Userland cman daemon. 
libcman to go with userland cman daemon Don't leave debugging on by default. argv[0] should be "cmand" not "cman" Lots more comments in libcman.h Make debugging comfigurable without a recompile. Remove some redundant stuff. Use rwlock rather than rw_semaphore Undo some of the "tidying" done by indent. Use local libcman Use local libcman Allow cman_tool to override the node name when joining. events are not REPLYs. Add command-line utility for managing cluster.conf files. Check in the source rather than a binary, sigh Move cluster_conf into ccs_tool Remove some unnecessary includes. If we failed to resolve the broadcast address, print the interface. Add the new commands to the ccs_tool man page. and fix a bad example. Add options to cman_dispatch() so that callers can filter out non-interesting Address numbers start at 1 Refill the nodes write queue once we are woken up after -EAGAIN. Don't allow name= as a fence argument as it causes problems. Use a different method for findin broadcast addresses. Improve "can't connect to cman" error. Build against dlm-kernel/src2 Missed header, sorry. Add dummy struct to keep compilation clean. unregister_lockspace() now works. Add a userland-cman plugin for magma. I've only tested this with ccsd. deprecated. Only print "waiting for cman" if verbose flag is set. Use getaddrinfo rather than the (obsolete) gethostbyname2 call. Don't lose the port number Bring forward some fixes from the kernel-based cman. Fail if the nodename maps to the loopback device. Fix potential SMP race Use sockaddr_storage rather than sockaddr_in6 Add option for 2-node cluster. move everything around! I always miss one of 'em. Fix some comments A bit more tidying. Move a bit more stuff around. ccs.h is not really a dependancy Fix crash with barriers, caused by overtidying. 
cmand depends on commands.o not commands.c Put saved messages on the right lists use umask so that permissions on /etc/cluster/cluster.conf are -rw-r----- Add some (hopefully helpful) comments Fix device refcounting file clm.c was initially added on branch STABLE. file saAis.h was initially added on branch STABLE. file saClm.h was initially added on branch STABLE. interesting typo. Build cman against openAIS's libtotem_pg. This is still pretty unstable stuff library commits that go with the last lot. I'm not sure why CVS missed them out. small Makefile fixes Return node addresses. Clear node struct before passing it into cman_get_node() Clear struct before calling cman_get_node() Unbind connections when they die. Recalculate quorum when we join. Add support for AIS security key. Add support for AIS security key. Only ask for POLLOUT notification unless we have something to send. Line up heading Cope with large (>PIPE_BUF) messages coming back from the daemon. Add IPv6 support, with pre-release AIS code. Remove send_queued_events() as it's not used any more. Configure AIS bit using CCS Temporarily enforce static node IDs until this is sorted out. Don't untar the ais source on every build Today's openais tarball has some important fixes in it. comment & message tweaks Strip down barriers so they use the VS features of AIS. Use ccs as the repository of config & node information (that is, after all, Fix up the port notification and add new new PORTOPENED notification Add CMAN_REASON_PORTOPENED callback reason. Patch for clvmd so that it can be used with libcman. With a dynamic libcman, Updated patch that works rather better. Missing comma & comment. If any of the queues have cached message in then return an fd for /dev/zero Refresh cluster FD before each select. Need to return earlier if the socket failed to connect. byteswap the header too new AIS version Use a temp variable for the node address, to avoid potential alignment problems. 
Don't try and do floating point maths in the preprocessor. Cope with node names in CCS that we can't resolve (provided they have a node ID) Updating ccs_tool's editconf commands for new cman schema (note: the old schema Add uS to the log timestamp Allow CCS version to go forward between cman_tool join & startup, but not Tidy multicast code and use a suitable (ip4 or 6) default if none is specified. Fix usage message Don't overwrite AIS node addresses just because a nodeid matches rename "commskey" to "keyfile" as that's slightly better. Install the AIS keygen program. Use slightly more RFC-compliant multicast address. "cman_tool status" prints the multicast address too. Use inet_ntop() to print IP addresses so the look nicer. Don't send two config change events for a "cman_tool expected -e". Lock ourself in memory Add DLM_LKF_FORCEUNLOCK so device.c doesn't have to muck about with Don't die if we get SIGPIPE Stick a version number in the pipe protocol so we can protect ourself from Don't default to 0 votes, it's annoying. Set the version number Pull slightly updated openais tarball. Slightly better error handling. Don't show joined time for non-members. Fix log message that could crash daemon new AIS If AIS passes us a node ID then we should beleive it. Read key after we've filled in the whole of the totem_config struct, as it new AIS incorporating nodeid zeroing patch that got missed. add htons() etc on nodeid in totem_ip_address. Don't spam ccsd with diff config entry names. (re-)read the two_node flag from CVS so we can change it on the fly. Update to latest AIS (taken straight from svn now) Sort nodes by nodeid CCS node names override temporary ones created from the IP address. Comment tidy sockets should be non-blocking! Tell aispoll to remove the client if it errors. Call out to group_tool for "cman_tool services". Reinstate lowcomms_close() to tidy up the output queue when a node leaves Oops, typo. Also tidy comments so they fit on an 80char line. 
Allow re-reads of CCS to set a new expected_votes value. Some of those messages should go to syslog Show 2node flag & error state in cman_tool status I hate CVS, why does it always ignore this directory? Put externs in the header file. Fix race where a lock could be unlocked before dlm_lock has completed. Allow non-root users to create the default lockspace. udev rules file for DLM file 51-udev-dlm.rules was initially added on branch RHEL4. Use libcman Add quorum device interface back in. Don't allow a user to release a lockspace if other users have it open. Show nodeid in "cman_tool status" output. Update to 0.71 of openais. Make nodeid mandatory Run cman as an AIS component. This allows other AIS services to be used Give aisexec longer to get started. Now that all the CCS accesses are in the Allow stderr log messages from ais if "cman_tool join -d" specified. Check if ccsd is running before attempting to start the daemon. Detach the daemon rather better Don't do endian conversion on nodeids, ais does that for us now. Get rid of some redundant stuff Use latest openais - with improved CPG support. ln -sf doesn't do what I though it did. so rm the symlink before creating as per lon: add cman_get/set_private routines. Return an error if range locks are attempted. Install and run cman from libexec This is a bit neater. Oops, slight bug in that Makefile there. Fix some errors in usage. Don't need to override ${sbindir} anymore Update patch so it works with new build system Now compiles against upstream (-mm kernel) DLM. Return the correct address length for a node. Return correct status sign for fundamental errors. As a courtesy, zero the whole node address field. Make errno handling a little more consistent Don't clear too much ! change SaNameT to struct cpg_name. Small reversion. The sequence number was pointless. Make sure p->ls is NULL when we start up. OR we could crash at shutdown. Remove some assertions that could fail in normal circumstances. 
Patch up so they compile. (mainly taking out query calls) Use latest AIS with CPG integrated into it. Odd things could happen if the lock space already existed but the device Don't return an error if we had to create the control device! Don't send extra state change message. two-phase cman shutdown. libcman changes for last one. Need to fix my CVS repository. Tidy up. and remove some unused structs & constants Try and keep ABI stable between versions. Fix includes OK, OK, I give in. We must register for events. libcman doesn't ncessarily return a padded sockaddr_storage. But it /does/ clvmd no longer needs patching for GFS2 Tidy help message, fix cluster name override, tidy "status" code. Use the cluster name as the AIS key if no keyfile is specified. "make install" now includes AIS headers & libraries. Fix typo of the year: "quorumdev_poo" More documentation on API calls. Make things a little clearer. Return EBUSY if we don't know whether a remote node "is_listening" or not. Loads more documentation for each call. Update to latest openAIS code. Update to latest AIS. we not don't need to patch the AIS sources as it Remove a (now) spurious include Do recursive queries on CCS and store in the objdb. Print a space between IP addresses Explain cna_address s/Blocked/blocked/ when displaying quorum. Add recognition of ALTNAME tags in CCS to specify multiple ethernet Allow users to specify log level on command line 'addnodeids' command adds node IDs to nodes that don't have them. Use revised ais loading system (that now works correctly) on FC5 Don't call lcr_component_register twice as it /really/ annoys aisexec. Use new AIS logging functions. Change the default port. If people try to run old and new cman on the Fix (I hope) hierarchy descending in CCS. Admin sockets can't get notifications either. Add API to keep track of when and how a node was last fenced. Inform cman when we have fenced a node. 
Use latest openAIS which, I think, will fix Dave's trouble Use new OpenAIS with important CPG fix. This needs libcman now too Disable VS filter Don't go into an infinite busy loop if cman shuts down. Add facility to assign a multicast address to the cluster. A rag bag of stuff: Make the examples compile and do something. Return all multicast addresses. Also return a list of (cman) ports Remove redundant kernel examples. Mention ccs_tool as a way of adding nodeids to a cluster.conf file. Tidy some of the inter-file communications. Be careful about returning junk as a node address. Get rid of some redundant calls Fix key file override. Don't let port number default to junk. Missing semicolon! oh dear. Make it background properly. Don't busy-wait if a write fails. Fix some stats as reported by cman_tool status Allow loggin facility to be configured. Add support for OpenAIS RRP. Add missing swab_header(). Do the big-endian conversion so it actually works. Align some message structures better Don't segv if the node has no "votes" property. Don't call a parameter 'time' as it confuses some compilers. Use new OpenAIS We need to build libccs -fPIC because it's included by cman which is a Remove dlm32 as all the 32/64 conversions are done in the kernel now. Build against installed headers rather the ./configured kernel source. Add include so we get a prototype for syscall() Add missing include. Make it clear that admin sockets can't receive callbacks either. Pull latest openAIS Update to OpenAIS with patch for CPG alignment bug Fix build error. Build using installed openais Run aisexec from SBINDIR gah! forgot to remove the /cman off the end of SBINDIR Count votes correctly - buy not shadowing variables, sigh. Don't force unwanted flags on people. Make SBINDIR default to /usr/sbin so we can find aisexec Make sure we ${libexecdir}/lcrso - packagers need it. Don't copy the agent name if it's NULL. 
Don't lose the end of a lock name for RRP use "active" rather than "passive" on Steve's advice. Update to use new openAIS totemip & totempg APIs. if we can't get the latest config from CCS, poll it until we do. Some systems need #include <signal.h> and who are we to deny them ? We don't really need to include signal.h twice :) Set a good example by checking return values. Create a pipe between cman_tool and the cman daemon so that it can At startup, check that ALL nodes in CCS have nodeids assigned. "group" should be "amf" Add a confchg callback to libcman, similar to the openAIS ones. Fix a bug in the demo prog. initialise confchg_callback Rename 'private' to 'privdata' so it doesn't upset C++ Fix strdup braindamage that probably caused segfaults when nodes Cope with short writes to the cman socket. Cope with a node being fenced manually and then going offline (ie someone Add struct entry for .flow_control to keep latest openais happy. Don't even start up if the local host name resolves to 127.0.0.1 Add some extra semantics to CMAN to cope with openAIS rejoins. A bit of a hack to cope with the race condition where dlm_controld gets Don't fence a node if it has already been fenced. If there are disallowed (AISONLY) nodes in the cluster, then name & shame them. Avoid spurious messages. and also fix an odd node count when nodes Sigh, got the condition back-to-front. 'while' should be an 'if' Get notifications BEFORE getting state otherwise we have a race condition. fix CMAN_DISPATCH_ALL. Lon's patch to user /etc/sysconfig/cman for customisation. Set the default token timeout to 5 seconds. It can still be overridden On Steven Dake's recommendation, also up the token_retransmit count to 20. 
fix bz#213747 if an AISONLY node dies, mark it DEAD Always compile in debug logging - you never know when it might come in handy Set join_timeout and consensus_timeout to higher defaults as per Tell cman when the config file has been updated Don't truncate the node name when we check for it unqualified. That 'if' really should have been a 'while'. Fix minor bug where cman_tool join didn't spot that aisexec had started Fix typo that could affect shutdown. Add cluster_id override field to cluster.conf, so that people can manually Increase token timeout to 10s as per bz#216954 Fix bug where cman_dispatch(CMAN_DISPATCH_ONE) could dispatch several Give a better error if the cluster name is too long. Send correct length of quorum device name sent to cman. Clear the node structure before calling cman_get_node(). If there are already queued messages for a client then don't send new ones Don't lose NUL on the end of the fence-agent. quorumdev_poll is in milliseconds, not seconds! Add flood program back. Don't return to 'cman_tool leave' until we are just about to quit. If we get killed by another node then print the reason in English rather Fix typo. Read the LVB every time, rather than not at all. Add threads example Add man page info for ccs_tool addnodeids Don't report 0 exit status as a failure. If exec fails, then tell the parent process. Add delay switch Add -c clustername to help output. Support IP(v4) addresses in cluster.conf If the machine is multi-homed, then using a truncated name in uname but not in Fix bug where we could free an lksb while dlm_lock is still using it. Actually, MAX_INT is a bit of a bad idea under this new system. Newer versions of udev prefer == to = Remove udev file from here as it is confusing. Install udev rules file Add const to libcman Change unsigned char* to char* for compatibility with openais trunk. Don't override <totem secauth> if it appears in cluster.conf. Allow ccs to change the two_node flag. 
Fix typo in openaisincdir Add swab.h for compiling against openais trunk open_lockspace needs to detect kernel version too, otherwise all lockops Don't link cman with libcman! Use new openais timers Fix timer durations Honour the mode parameter to dlm_create_lockspace() even if the libdlm man pages remove redundant Makefile lines file dlm_ls_query.3 was initially added on branch RHEL4. file dlm_ls_query_wait.3 was initially added on branch RHEL4. file dlm_query.3 was initially added on branch RHEL4. file dlm_query_wait.3 was initially added on branch RHEL4. Don't lost the cluster name if it is specified on the command line Add a "dirty" flag to cman to prevent active clusters merging with one-another. Clear error flag for SET_DIRTY Document that cman_set_dirty() needs an admin socket. Update man page Add some information about setting up multi-home (redundant ring) Add some info about openais.conf parameters Mention the openais.conf parameters that cman overrides. Add a 'cman_tool debug' command that allows cman debugging levels to be Fix spelling of DAEMON, sigh Correctly reduce quorum when a node leaves using "cman_tool leave remove" Allow it to build with -O2 Make it compile with -O2, by fixing a very dodgy cast. Fix compile with -O2 -Werror Use openais logsys functions. check quorum device name length against the right size. Fix type-punned pointer warnings Don't use _logsys functions as I get my wrist slapped. Recalculate quorum based on the expected votes value of a new node. Call "group_tool ls" for cman_tool services Reinstate cman_tool services, which got lost inexplicably. Use "logger_subsys" & "subsys" keys rather than "logger" and "ident". Tidy logsys use. Make sure it compiles against latest openais trunk. Enable to_stderr logging if 'cman_tool join -d' is used. Enhance API to retrive just the quorum device information using cman_get_node() Add missing format string. 
Add an explanation of the node states shown by "cman_tool nodes" and some information about the "disallowed" state. file msgtest.c was initially added on branch RHEL4. Clear out the ports opened list when a node goes down. Reinstate 'cman_tool join -X', allowing people to start a cluster without Print votes of quorum device in cman_tool status Add option to disable kernel_check. Add multi-path capability. Each address we get from cman is now Tidy comments Set networking parameters suitable for running DLM over sctp Some small fixes to the networking param code, thanks to Fabio on IRC Fix altname option Allow rrp_mode to be overridden in cluster.conf totempg_ifaces_get() always copies INTERFACE_MAX addresses Let's see if I can do this commit properly... Use define CMAN_NAME for the purpose for which it was intended Get rid of redundant totemip_parse() call. This was in a bad place and could Add command-line override for 2node mode. Zero namelen when doing an unlock. On 32/64 bit systems it can make a Improve startup error checking and logging. Oops. a bit too much cman3 fell into that last checkin Change a log_printf() into a syslog() so that the die message Implement a nicer way of getting the quorum disk information.
Robert Peterson (206): Fix for Bugzilla Bug 155304 – gnbd_monitor doesn't correctly reset after an This fixes Bugzilla bz 178367. Memory leak when reading from Fix for Bugzilla Bug 178453 – Slow memory leak in /proc/cluster/dlm_dir Fix for bz 186125: gfs_fsck on GFS 6.1TB filesystem gives error and Fix for bugzilla bz 179069: gfs_fsck unable to fix file system. Initial checkin of libgfs2. New and improved gfs2_edit tool based on ncurses. Initial checkin of libgfs. Got rid of automake and added minor fixes. Remove automake-related files and redundant includes. Initial checkin of libgfs2 and related. Incorporate libgfs2. Initial checkin. Add convert tool and libgfs2. Compile libgfs2 first. Removed obsolete gfs2_debug tool. Added refs to libgfs2, gfs2_convert, and gfs2_fsck. Several updates and bug fixes, mainly for gfs2_fsck. Prep for libvolume_id, minor changes due to libgfs2 changes. Ability to print directory details, minor libgfs2 changes. Fix several bugs and add changes necessary to match libgfs2. First working version that uses libgfs2. May still have problems. Remove obsolete parts now in libgfs2. Missed on initial check-in. Misc bug fixes. For example, it was not updating block free counts for fsck. Added gfs2_fsck, added gfs2_convert, renamed gfs2_mkfs to mkfs.gfs2 Moved functions from fsck to libgfs2, added functions to make Got rid of dependency on libgfs. Every is now done with libgfs2 Made all block numbers show in decimal and hex. Minor bug fixes. Fixed a bug when printing stuffed directories. For example, Change copyright to 2006 Add gfs2_fsck to Makefile Fix typo in Makefile Improvements to Makefile, renamed gfs2_mkfs man page to mkfs.gfs2.8 Remove obsolete references to unlinked_tag. This addresses bugzilla bug #156009 - gfs fsck needs a good Added some error reporting back in when checking for gfs2 file systems. Fixed a bug where changes to the root inode are not written to disk. Changes to lock protocol were not saved. 
Also removed some vestigial Converted file systems had no journals. Fixed problems printing stuffed directories like master and jindex. Fixed bugs regarding acls and eattrs. Also crosswrote some fixes Switch to libvolume_id method of determining pre-existing file Clean up .d files on Make clean rather than distclean Include man pages for convert, fsck, etc., in Makefile. Change gfs to work with new locking infrastructure. New ncurses-based gfs_edit synced from RHEL4 and STABLE branches. Changes necessary due to removal of iddev parts (replaced by libvolume_id). Changes necessary due to removal of iddev parts (replaced by libvolume_id) Changes necessary due to removal of iddev parts (replaced by libvolume_id) Re-add locking.c with its redundant gfs_mount_lockproto and Reverse previous decision on locking.c. Removed unwanted sysfs Changes necessary due to removal of iddev parts (replaced by Remove gulm, dlm and nolock from Makefile. gfs1 will now use A printf to stdout was getting redirected to the daemon's socket This is a bug fix for bz 164499. It allows loopback-mounted files Split read_sb into read_sb and compute_constants like libgfs2. Fix divide by zero because superblock constants were not Accomodate changes Steve Whitehouse made to gfs2's dinode structure. Add /proc/fs/gfs support back in. Rename req_lock to gfs_req_lock to avoid duplicate symbols. Remove iddev from configure script. Patch from Fabio Di Nitto: Service stop was killing daemons, which hung system at umount time. Moved cman_tool from /sbin to /usr/sbin Switch was specified incorrectly for apc power switch. 1. Allow SIGINT signals so that gdb can break into hung mounts. OpenAIS builds for /usr/lib64/openais on x86_64 machines. Add useradd for ais user, added instructions for gfs (1). Fix compilation problems on x86_64 (link against /usr/lib This is a fix for bugzilla bz 164499 (Unable to mount loopback The gfs2 userland tools weren't compiling when cluster configure Got rid of iddev references. 
Hex values were not shown or printed correctly on x86_64. Fix compile error with vmalloc. Mounting was mistakenly allowed with too few journals. Fix minor compile problem due to missing include. Fix include gfs_ondisk.h to be located in gfs kernel source rather Make block_list use a consistent set of values rather than Get rid of symlink "linux" for referencing includes and This change is for Makefile reform allowing a simple Reset other inode bits when temporarily setting S_IFDIR bit. Fixed segfault converting bitmaps during inode conversion. Fixed segfault in gfs_controld. Get gfs_ondisk.h from local includes, not kernel includes. This is a fix for Bugzilla Bug 203916: groupd daemon segfault and This is the fix for bugzilla 200883: gfs_fsck segfaults. This is a crosswrite from gfs1 for bugzilla bz 200883: gfs_fsck Addendum to bz 200883. If gfs_fsck can't finish initialization, Addendum to bz 200883. If gfs2_fsck can't finish initialization, Add the "-w" (wait) and "-t" (timeout) parameters back in Add -w option back to fence_tool join in cman init script. This is for bugzilla 210162: fence_tool needs -w and -t options This fix is for bugzilla 210641: Race condition hang/failure This is the fix for bugzilla bug 210587: Oops in gfs_get_dentry This is a fix for bugzilla bug 210300: Unknown mount option This is the fix for bugzilla bug 210369: acls are not enabled This is the fix for bugzilla bug 211337: must create core files This is the fix for Bugzilla Bug 210732: ccsd doesn't spot Fix for Bugzilla Bug 211405: If groupd segfaults, dump the most This is the fix for Bugzilla Bug 210344: group_tool does not This is the fix for Bugzilla Bug 214513: gfs2_convert must reject This is the fix for Bugzilla Bug 214621: Allow gfs2_edit to view, This is the fix for Bugzilla Bug 214625: Add group_tool log function This is the fix for Bugzilla Bug 214524: group_tool dump can give Ability for gfs2_edit to handle gfs1 indirect metapointers. 
Resolves: bz211465 Resolves: bz215817 Resolves: bz208836 - fatal: invalid metadata block Fix another case where lf_dirent_format was not rewritten to disk Resolves: bz216898 Resolves: bz217460: fence_tool man page updates needed. Resolves: bz216902: mkfs.gfs2 allows non-4K block size. Resolves: bz217436: Several updates needed to cluster.conf man page. Resolves: bz217436: Several updates needed to cluster.conf man page. Resolves: bz213763: mkdir takes more time on larger file systems. Resolves: bz217798: Need to port Resource Group optimization from Resolves: bz218134: GFS & GFS2: umount while busy gives bogus error Resolves: bz 219866: GFS init script - FATAL: Module lock_dlm is Resolves: bz 218560: multiple mount points fail with gfs and gfs2 Resolves: bz 219878: gfs2 creation should default to 1 journal Resolves: bz 219876: mount.gfs hangs if there are insufficient Resolves: bz 222747: Remove references to lock_gulm from cluster man pages Resolves: bz 222743: gfs_grow gets the rgindex out of order. Resolves: bz 222871 gfs_fsck runs slower than previous versions Resolves: bz 222933: regression: fence_tool no longer times out Resolves: bz 223506: gfs2_fsck: fatal: invalid metadata block Resolves: bz 223843 GFS2: gfs2_fsck segfaulting on corrupt extended Resolves: bz 223500: gfs2_fsck runs slower than previous version Resolves: bz 222759: gfs_mkfs doesn't zero data after gfs superblock Resolves: bz 222299: gfs knows of directories which it chooses not Misc updates to bring gfs-kernel up to the 2.6.20-rc7 and similar Fixed some bugs and made some improvements. Resolves: bz 222308: mkfs and journal addition for GFS2 should produce Resolves: bz 221743: gfs2_fsck errors still Misc improvements. Better scrolling. You can now scroll through Resolves: bz 229220: gfs_fsck stuck in infinite loop Resolves: bz 229601: gfs_tool fails to report counters Resolves bz: 229222: gfs2_fsck stuck in infinite loop Made hex editing a lot easier (for bz 229484). 
Fixed several bugs Resolves: Bugzilla Bug 232019: gfs2_fsck doesn't fix an ea problem. Resolves: Bugzilla Bug 232124: gfs2_fsck will create multiple Resolves: Bugzilla Bug 233083: Wrong link command in gfs2-utils Jump from RG index was broken. Resolves: Bugzilla Bug 235060: gfs_fsck: Bad programmer! You forgot Resolves: Bugzilla Bug 235061: gfs_fsck: Bad programmer! You forgot Horrible kludge to allow display/print of the rgs themselves Resolves: bz 223893: gfs2_fsck unable to fix damaged RGs and RG indexes. Resolves: bz 238740: GFS fsck is has problems with resource groups Resolves: bz 229484: gfs_fsck not good at fixing corrupt directory entries Resolves: Bugzilla Bug 234844: Need to add a "gfs2_grow" command Resolves: Bugzilla Bug 239844: mount.gfs2 doesn't work when _netdev Close the /sys/fs directory after using it. Resolves: Bugzilla Bug 239023: gfs2_fsck not good at fixing corrupt Resolves: Bugzilla Bug 242056: GFS2 needs block sizes < 4k (mkfs changes) Resolves: bz 244163: Incorrect output of gfs2_tool sb <dev> all Resolves: bz 240570: Can't mount GFS file system on AoE device Make gfs2_edit handle small different block sizes. Resolves: bz 245360: GFS2: userland tools have problems with small Fix a place where indirect offsets were calculated incorrectly. Add savemeta and restoremeta functions to gfs2_edit Resolves: bz 245360: GFS2: userland tools have problems with small Revolves: bz 245803: GFS2: buffer count underflow for block Resolves: bz 241096: GFS: bug in gfs truncate Resolves: bz 247591: Make default journal size for gfs2 128M I added the ability to recurse indirect blocks. That means Resolves: bz #247591: Make default journal size for gfs2 128M Resolves: bug #248423: gfs2_tool can not set data journal flags as Added ability to parse and print journal information. For example: Resolves: bz #240545: gfs2_fsck should behave more like the other fscks. 
Resolves: bz 287901: GFS2: fsck errors and corruption with files > 945MB Resolves: bz 291451: gfs2_fsck -n, Bad file descriptor on line 63 of Resolves: bz 291451: gfs2_fsck -n, Bad file descriptor on line 63 of Resolves: bz 247318: Need man page for gfs2_edit Resolves: bz 304001: GFS2: Filesystems with 1k block size won't mount Resolves: bz 247318: Need man page for gfs2_edit Resolves: bz 240545 (addendum). Resolves: bz 295301: Need man page for gfs_edit Resolves: bz 251180: Build time warnings for gfs2 userland tools Add the ability for gfs2_edit to print gfs1 journals. Resolves: 235931: gfs2_edit command to set NOALLOC flag While working on bz #291551, I discovered that gfs2_edit savemeta Minor correction to the previous commit. Bopping through indirect Resolves: bz 291551: gfs2_fsck clears journals without asking. Resolves: bz #334481: gfs2_jadd man page refers to non-existent -T option Resolves: bz 345501: GFS2: gfs2 utils uses non-canonicalized names Resolves: bz 345501: minor correction to previous commit. Resolves: bz 337961: gfs_grow /mountpoint/ does not work Resolves: bz 349601: GFS2 requires straightforward way to determine Resolves: bz 354201: GFS2: gfs2_tool: unknown mountpoint on some Resolves: bz 352581: GFS2: implement gfs2_tool lockdump Fix a divide by zero if the target isn't a gfs or gfs2 file system. Resolves: bz 336561: gfs2_tool accepts jdata flag; man page says no Printing the quota file wasn't printing its contents due to a bug. Resolves: bz 364741: GFS2: gfs2_quota doesn't work unless lock Fixed printing of gfs1 journals. Resolves: bz 352841: GFS2: Evaluate and implement missing gfs2_tool Added ability to save inode extended attributes in "savemeta". gfs2_edit wasn't printing directory entries and extended attributes Add the "printsavedmeta" option to the gfs2_edit man page. Resolves: bz 382581: GFS2: gfs2_fsck: buffer still held for block Resolves: bz 402971: GFS2: gfs2_edit savemeta doesn't save rindex file. 
Resolves: bz 325151: GFS2: gfs2_fsck changes to system inodes don't stick Resolves: bz 426670: GFS2: man page for gfs2_tool has commented Fixup contributed by Andy Price. Resolves: bz 429633: gfs_tool doesn't recognize GFS file system Resolves: bz 223660: man gfs2(8) refers to the gfs2_mkfs manpage
Ross Vandegrift (1): [FENCE] Add fence_ifmib new agent
Ryan McCabe (51): support switches with greater than 9 outlets, and handle lists that run longer than one screen. fence agent for fujitsu-siemens primergy rsb device logout correctly on status check rsb agent, this time named rsb along with a man page and makefile mods renamed to rsb Ignore outlet groups fence_apc_snmp support Add DRAC5 and DRAC4/I support Update the perl fence agents to take the additional command line option -S <path> or stdin param passwd_script=<path> Support the "passwd_script" parameter in the C fence agents. - Document the -S/passwd_script fence params. Support power on/reboot for iLO2 Add 'M' to the getopt string to keep clusvcadm from complaining that M is an invalid option. Make power on work correctly for RIBCL version 2.22 on both iLO2 and iLO: Work around network disruption caused by XenD's bridged networking (bz230783, bz231227). don't try to workaround xend networking when running on a non-xen kernel Convert line breaks to
Rename "private" to "priv" to make the file usable by C++ programs, and wrap the header with extern C { ... } if compiling C++. HP changed the iLO 2 interface again in the latest firmware revision, 1.30 (released on 2007-06-01) Detect bridged networking configurations where additional parameters are supplied to the script. Fix a few (harmless) places where memory is allocated but not freed I stumbled onto hunting down something else. Fix access beyond allocated memory Fix a handful of possible NULL pointer derefs listen() is not supported on SOCK_DGRAM fix bz277781 by accepting "nodename" as a synonym for "node" Fix 314091 add new function to libccs: Fix code that caused warnings on platforms where sizeof(void *) != sizeof(int) Allow valid addresses of nodes even if they're not identical to the way they're specified in cluster.conf E2BIG is more appropriate than ENOSPC here - Fix unsafe string handling: Commit msg with the last commit went missing.. Fix format string bug More format string fixes rgmanager format string fixes Keep gcc from reporting a bogus warning when compiling with -Wformat=2 Compile with -Wformat=2, which will catch usually dangerous format string bugs patch from Marco Ceci to fix 354421 Allow "option=(on|off|reboot)" (currently only fence_ilo takes "action") Fix bz434790 Fix a few misspellings Merge branch 'master' of ssh://sources.redhat.com/git/cluster Feeling pedantic. More spelling fixes. Merge branch 'master' of ssh://sources.redhat.com/git/cluster fix bz277781 by accepting "nodename" as a synonym for "node" fence: fixes and cleanups to fencing.py library libfence: handle EINTR correctly libfence: update copyright notice fence: update apc snmp agent cman: Fix typo that caused start-up to fail libfence: whitespace cleanup
Ryan O'Hara (73): Removed magma dependencies. Fixed compiler warnings. Initial check-in of SCSI persistent reservation fence agent. Initial check-in of SCSI persistent reservation init script. Fix perl cmd declaration that caused sg_persist to fail. Added parameters for chkconfig. Added extra output when verbose option is given. Fix stdin parameter parsing to handle 'name=value' correctly. Added "self" parameter as a way to pass our_name to the agent. Added "self" parament to dispatch_fence_agent. Added "self" parameter to dispatch_fence_agent. Name of node to be fenced is passed via "nodename=" parameter. Initial version of the fence_scsi man page. Updated copywrite and fixed title. Added success and failure commands in start/stop. fence_scsi agent should use "self" rather than try to determine node node. Move scsi_reserve init script to fencing agent directory. Moved from fence/scripts directory. Remove extra argument from log_debug call. Added support for SELinux extended attribute types. Fixed typo. "ccstool" should be "ccs_tool". Early version of a script to help users determine if a logical volume Remove error handling for missing magma plugins. Moved code which signals parent (SIGTERM), which allows the parent process ccsd is now fixed such that it will not daemonize until the socket is ready Add GFS_EATYPE_SECURITY as valid xattr type and increment GFS_EATYPE_LAST. Added gfs_security_init to initialize SELinux xattrs for newly created Add scsi_reserve init script to Makefile so that it gets installed. Remove scsi_reseve from "all". This will be handled by the agent make target. Add code to create initdir if it doesn't exist. Remove unnecessary chmod for scsi_reserve. Added fence_scsi_test to help test SCSI reservation capabilities. Include sd_freeze_count in counters output. Detect and fix potential endia problem in lf_dirent_format. Fix annoying whitespace inconsistency. Fix comment. file scsi_reserve.sysconfig was initially added on branch RHEL4. 
file scsi_watchdog was initially added on branch RHEL4. file scsi_watchdog.conf was initially added on branch RHEL4. If no password is specified, pass a "-P ''" to the ipmitool to prevent Ignore EPIPE error when sending response. This can happen is, for example, Fix help message to refer to script as 'fence_scsi_test'. Read nodir from lockspace xml node via ccs_get. BZ 240584 - Check to see if device is mounted before creating filesystem. Fix bug where mkfs always exits with EXIT_FAILURE. BZ 249781 - Fix ccs_tool to return EXIT_SUCCESS for most commands. Add ability to format output and filter based on node name. BZ 323111 Fix issue with endian conversion that caused problems for mixed architecture Variable should be quoted in conditional statement. Fix unregister code to report failure correctly. Remove "self" parameter. This was used to specify the name of the node Fix code to use get_key subroutine. Fix split calls to be consistent. Remove the optional LIMIT parameter. Replace /var/lock/subsys/${0##*/} with /var/lock/subsys/scsi_reserve. Fix success/failure reporting when registering devices at startup. Rewrite of get_scsi_devices function. Record devices that are successfully registered to /var/run/scsi_reserve. Allow 'stop' to release the reservation if and only if there are no other Attempt to register the node in the case where it must perform fence_scsi Fix help message to refer to script as 'fence_scsi_test'. BZ 248715 BZ: 373491, 373511, 373531, 373541, 373571, 429033 BZ 441323 : Redirect stderr to /dev/null when getting list of devices. 
      gfs_mkfs: change the way we check to see if a device is mounted
      cman: add option to init script to prevent joining the fence domain
      cman: fix typo (#!/bin/bash) from previous commit
      ip.sh: add sleeptime parameter
      cman: allow custom xen network bridge scripts
      fence_scsi: improve logging for debugging
      fence_scsi: correctly declare key_list
      BZ 453429: Fix conditional check of $OCF_RESKEY_migration_mapping
      Fix check_mount to correctly test if device is mounted/busy.
Satoru SATOH (1):
      fence: Add network interface select option for fence_xvmd

Simone Gotti (1):
      [rgmanager] Fix fuser parsing on later versions of psmisc
Stanko Kupcevic (40): Fixed bz167769: fs.sh doesn't do 10 & 20 OCF_CHECK_LEVEL checking Added watchdog that reboots if clurgmgrd crashes Fixed bz167217, and handling of DOWNed interfaces Initial checkin of cs-deploy-tool Initial checkin of clumon Minor GUI touch ups Display size in GBs Tooltips and touch-ups Message to add two nodes in order to detect shared storage RPM prerequisites & alpha build Addition of html documentation Package docs, build 0.9.2 clumond c++ rewrite build system specfile, dependencies, buildsystem Daemonization of clumond added logging with multiple levels of verbosity spec and code cleanup Memory leak in Socket, remove pidfile on exit, catch SIGCHLD Memory corruption due to libxml2 not being thread safe Abort command execution after 3 second Restart daemons on upgrades, compile with debugging info Resources used after their release Signal-safe logging Replaced tmp OID with unique one agent and provider READMEs Include pegasus headers for zSeries Added rhcClusterNodesNames, rhcClusterAvailNodesNames, rhcClusterUnavailNodesNames, rhcClusterServicesNames, rhcClusterRunningServicesNames, rhcClusterStoppedServicesNames, rhcClusterFailedServicesNames to rhcCluster, so that users of clients that don't display whole snmp table in a single view (eg. HP OpenView), can see all failed/stopped/... services in one place. Node and service tables haven't been removed. Also, clarified descriptions. Make sure all data is read() from buffer before closing fd Sample snmpd.conf Verify that node is subscribed to proper RHN channels add path to logfile to error popups Remove pvs from list of available storage; don't create VG if no storage configured Improved rpm selection and installation phase GUI touch up FC5 support, start cs-deploy-tool with --fc5 argument "clumon moved under Conga project" message fence_apc_snmp ignores "port" parameter Support "passwd_script" parameter in python fence agents. Fix for bz230134 (can't fence port 1:1 with fence_apc)
Steven Whitehouse (119): N.B. This patch changes the ondisk format of GFS2, so you'll need to remake This patch introduces the tty_write_message() function rather than Sorry - another ondisk format changing check in... A two line patch to fix a bug where the quota_enforce setting is ignored. Remove an unused variable. Fix to quota bug fix... I had accidently commited the RHEL4 patch in This patch removes an allocation of memory which was occuring on every lookup Temporary variable "error" is not required since there is already a This patch moved the i_alloc structure into the incode inode. This results Fix up warnings due to kernel function prototypes not matching a couple of Tidy endianess conversion in lvb.c. Half the macros were not used Removed the hfile_trunc ioctl(). Its not used anywhere at the moment, but Fix sparse warning caused by use of int rather than gfp_t in a function A checkpoint in the sparse annotation. There are still a number of areas This touches just about everything and changes GFS2 to be big-endian on These are the fixes to mkfs in order to get it to build against the new This change makes the "hidden" files and directories visible to users mkfs changes to allow building of filesystems with visible .gfs2_admin Some __read_mostly annotation. Merge the lock harness into gfs since it no longer makes sense to retain Remove some unused code relating to printing the lock state. If we are to Adding back the get/set flags ioctl, which is the one ioctl that I think Update mkfs to use the new ioctl for get/set of flags for files. Remove a test which is no longer needed. Found and fixed an endianess bug where we were using the wrong conversion Write a list of the block numbers of the succeding metadata blocks which Remember to take out my debugging printks :-) All normal (i.e. not journaled) file reads/writes now go via the page The second half of yesterday's patch. 
This is the recovery part and now mkfs no longer writes the mh_blkno field into the common metadata header GFS2 no longer writes mh_blkno into the common metadata header for This takes mkfs out of the main build system. It should make it much This reverts the dirent ondisk structure to be the same on disk as Added a simple set of compile/install instructions. Prevent "uninitialized block" message printing out since several Change the .gfs2_admin dir such that its no longer linked from the root Fix a bug where the wrong size endian conversion was used to initialize libgfs2: Add support for UUID generation to gfs2_mkfs libgfs2: Remove unused #defines mount.gfs2: Remove unused ondisk2.c file gfs_controld: recv error checking GFS: Send sensible sysfs stuff GFS: Send useful information with uevent messages cman: loading lock_dlm module should be optional in initscript docs: Update some docs doc: Update makefile tool: Remove old perl scripts gfs2_tool: remove unused code mkfs: Remove unused code libgfs2: Remove unused code man: Update man pages to new names, fix refs etc include: Remove almost unused headers libgfs2: Remove three unused functions libgfs2: Remove unused functions from misc.c libgfs2: More fixes in bitmap.c and block_list.c libgfs2: Remove unused code, general clean up libgfs2.h: add externs libgfs2: Fix typo in previous patch Merge branch 'master' of git://git.fedorahosted.org/gfs2-utils libgfs2: Remove pv macros from library libgfs2: Remove RANDOM/SRANDOM macros libgfs2: Change some macros into inline functions libgfs2: Remove another macro libgfs2: format checking for printf-like functions libgfs2: gfs2_disk_hash is a .c not a .h! 
libgfs2: Move SYS_BASE macro libgfs2: Merge bitmap.c into block_list.c libgfs2: Merge linux_endian.h into libgfs2 libgfs2: Remove ancient, obsolete alpha workaround libgfs2: Merge ondisk.h into libgfs2.h libgfs2: Remove copyright.cf include from libgfs2.h fsck: Using wrong hash size libgfs2: Forgot to remove headers from apps libgfs2: Remove a typedef that was only used once libgfs2: metapointer function is only used internally libgfs2: First go at cutting the number of "die"s libgfs2: test_locking should be in mkfs mount.gfs2: Fix wrong header libgfs2: Move prog_name out of the library gfs_controld: One header removal too far gfs2_edit: Tidy up EXTERN stuff libgfs2: DIV_RU Macro bites the dust mount.gfs2: Move the endian functions out of gfs_ondisk.h Remove unused code from various places gfs2_tool: gettext support mkfs.gfs2: Add gettext support gfs2_tool: Fix misplaced bracket that bob spotted fsck.gfs2: Add gettext support Makefile: Fix problem which crept in earlier Merge branch 'master' of git://git.fedorahosted.org/gfs2-utils gfs2_tool: Use FIFREEZE/FITHAW ioctl gfs2-utils: Further translation updates gfs2_tool: Remove obsolete subcommands libgfs2: Remove unused library function gfs2_tool: Remove ref to non-existent sysfs file gfs2_tool: Remove code to read args/* gfs2_tool: Fix help message man: Remove obsolete info from mount.gfs2 man page man: More updates gfs2_tool: Remove df command from gfs2_tool libgfs2: Use -o meta rather than gfs2meta fs type mkfs.gfs2: Remove dep on libvolume_id gfs_controld: Remove some unused code doc: Add diagram of how things fit together gfs_controld: Clean up & fixes gfs_controld: Remove three unused functions gfs2: man page updates GFS2: Man page update GFS2: Add man page for tunegfs2 GFS2: Clean up initscript GFS2: Make mkfs.gfs2 install to the correct location GFS2: Make mount.gfs2 install to the correct location init.d: Add initscript for gfs_controld GFS2: Add script to create release tarball GFS2: Specify some constants 
directly gfs_controld: Allow mounts entirely via sysfs/uevent interface gfs_controld: Allow paths names to be changed, but keep defaults build: New build system gfs2_edit: Fix bitmap editing function
Wendy Cheng (27): This patch is part of fix for bugzilla 178469 where the Properly handle error return code from verify_jhead(). Joined work of bugzilla 164331 (Abhijith Das) and 178469 (specsfs): Bugzilla 182057 - patch 3-1: Bugzilla 182057 - patch 3-2: Bugzilla 182057 - patch 3-3: Sync with base kernel data structure changes: Bugzilla 199984: Increasing gt_statfs_slots tunable could significantly Bugzilla 203170 - direct IO deadlock: We'll have the same deadlock as Just found 2.6.18 kernel has something called down_read_non_onwer for Port RHEL4 GFS AIO (asynchronous IO) implementation into RHEL5/FC6 and Bugzilla 211622: GFS1 will asserts at xmote_bh() if DLM grants SHARED Bugzilla 211622 - Root issue is found and fix. Backout the workaround. Bugzilla 214274: GFS has been splitting large writes into smaller atomic Bugzilla 214274: Oops... only directIO has this issue - buffer IO should be bugzilla : 217374 - temporarily disable GFS1 withdraw until bz215962 is ready. Temporarily disable GFS natvie AIO support since it currently breaks GFS(s) expects NFS fh_type and fh_len would have the same value. Apparently we can't remove these two methods from file operations table. Bugzilla 236565 Bugzilla 242759: bugzilla 244343: Bugzilla 239729: Bugzilla 231904: RedHat bugzilla 239727: Bugzilla 239729: Bugzilla 239729:
ccaulfield (1):
      Allow unnamed parent objects. This fixes a bug where

jparsons (1):
      Bump MAX_DEVICES in fenced from 4 to 8
rohara (2):
      fence_scsi.pl: check if nodeid is zero
      scsi_reserve: add restart option
root (1):
      [fence] fence_xvmd: Add KVM support; misc cleanups.