<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/libglusterfs/src, branch v3.7.0beta2</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>features/bit-rot-stub: versioning of objects in write/truncate fop instead of open</title>
<updated>2015-05-10T15:14:33+00:00</updated>
<author>
<name>Raghavendra Bhat</name>
<email>raghavendra@redhat.com</email>
</author>
<published>2015-04-09T10:08:47+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=da48a6a596251c19a8ddb1bdfec3da9744a78b8f'/>
<id>da48a6a596251c19a8ddb1bdfec3da9744a78b8f</id>
<content type='text'>
* This patch brings in the changes where object versioning is done in write and
  truncate fops instead of tracking them in open and create fops. This model
  works for both regular and anonymous fds. It also removes the race associated
  with open calls, create and lookups.

  This patch follows the below method for object versioning and notifications:

  Before sending writev on the fd, increase the ongoing
  version first. This makes anonymous fd write similar to the regular
  fd write by having the ongoing version increased before doing the
  write.

  Follow these steps to do versioning:
  1) For anonymous fds set the fd context (so that release is invoked) and add
     the fd context to the list maintained in the inode context.
     For regular fds the above thing would have been done in open itself.
  2) Increase the on-disk ongoing version
  3) Increase the in memory ongoing version and mark inode as non-dirty
  4) Once versioning is successfully done send write operation. If
     versioning fails, then fail the write fop.
  5) In writev_cbk mark inode as modified.
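
  The ordering above can be sketched as follows. This is only an
  illustration in Python, not the stub's C code: the Inode class, its
  fields, versioned_write() and the disk_ok flag are all hypothetical
  names invented for the sketch.

```python
class Inode:
    """Hypothetical stand-in for the stub's per-inode context."""

    def __init__(self):
        self.on_disk_version = 0
        self.in_memory_version = 0
        self.dirty = True
        self.modified = False
        self.fd_contexts = []
        self.data = b""


def versioned_write(inode, fd_ctx, payload, anonymous=False, disk_ok=True):
    """Bump the ongoing version first; only then let the write proceed."""
    # Anonymous fds get their context registered at write time; regular
    # fds are assumed to have been registered at open.
    if anonymous and fd_ctx not in inode.fd_contexts:
        inode.fd_contexts.append(fd_ctx)
    # Bump the on-disk version; if that fails, the write fails too.
    if not disk_ok:
        return False
    inode.on_disk_version += 1
    # Mirror the bump in memory and mark the inode non-dirty.
    inode.in_memory_version += 1
    inode.dirty = False
    # Versioning succeeded, so the write itself goes through.
    inode.data += payload
    # The write callback marks the inode as modified.
    inode.modified = True
    return True
```

  A failed version bump (disk_ok=False in the sketch) leaves both the
  data and the version counters untouched.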

&gt; Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c
&gt; BUG: 1207979
&gt; Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10233
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;

Change-Id: I4bb86989b5fab02b9ed2950798b1a80e566f1024
BUG: 1220041
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10722
Reviewed-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;
Tested-by: NetBSD Build System
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* This patch brings in the changes where object versioning is done in write and
  truncate fops instead of tracking them in open and create fops. This model
  works for both regular and anonymous fds. It also removes the race associated
  with open calls, create and lookups.

  This patch follows the below method for object versioning and notifications:

  Before sending writev on the fd, increase the ongoing
  version first. This makes anonymous fd write similar to the regular
  fd write by having the ongoing version increased before doing the
  write.

  Follow these steps to do versioning:
  1) For anonymous fds set the fd context (so that release is invoked) and add
     the fd context to the list maintained in the inode context.
     For regular fds the above thing would have been done in open itself.
  2) Increase the on-disk ongoing version
  3) Increase the in memory ongoing version and mark inode as non-dirty
  4) Once versioning is successfully done send write operation. If
     versioning fails, then fail the write fop.
  5) In writev_cbk mark inode as modified.

&gt; Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c
&gt; BUG: 1207979
&gt; Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10233
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;

Change-Id: I4bb86989b5fab02b9ed2950798b1a80e566f1024
BUG: 1220041
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10722
Reviewed-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;
Tested-by: NetBSD Build System
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>features/bitrot: Throttle filesystem scrubber</title>
<updated>2015-05-10T12:29:31+00:00</updated>
<author>
<name>Venky Shankar</name>
<email>vshankar@redhat.com</email>
</author>
<published>2015-04-27T16:04:34+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=32865f8650057123a5fcf590c96a1ae3f6d22608'/>
<id>32865f8650057123a5fcf590c96a1ae3f6d22608</id>
<content type='text'>
This patch introduces a multithreaded filesystem scrubber based
on the throttling option configured for a particular volume. The
implementation "logically" breaks scanning and scrubbing with
the number of scrubber threads auto-configured depending upon
the throttle configuration. Scanning (crawling) is left single
threaded (per brick) with entries scrubbed in bulk. On reaching
this "bulk" watermark, scanner waits until entries are scrubbed.
Bricks for a particular volume have a set of thread(s) assigned
for scrubbing, with entries for each brick scrubbed in a round
robin fashion to avoid scrub "stalls" when a brick (out of N
bricks) is under active scrubbing.
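
The round robin hand-off between bricks can be pictured with the
sketch below. It is a simplified, single-threaded illustration in
Python, not the daemon's C code; round_robin_scrub() and the shape
of its argument are assumptions made for the example.

```python
from collections import deque


def round_robin_scrub(brick_queues):
    """Return (brick, entry) pairs in the order they would be scrubbed.

    brick_queues maps a brick name to the entries its scanner queued.
    One entry is taken from each brick in turn, so a brick with a long
    queue cannot stall the others.
    """
    pending = deque(
        (name, deque(entries))
        for name, entries in brick_queues.items()
        if entries
    )
    order = []
    while pending:
        name, entries = pending.popleft()
        order.append((name, entries.popleft()))
        if entries:
            # The brick still has work: send it to the back of the line.
            pending.append((name, entries))
    return order
```

With two bricks holding ["a", "b"] and ["c"], the sketch scrubs
"a", then "c", then "b".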

This mechanism helps us implement "pause/resume" with ease: all
one needs to do is clean up the scrubber threads and let the main
scanner thread "wait" until scrubbing is resumed (where the
scrubber thread(s) are spawned again), therefore continuing
where we left off (unless we restart the daemons, where the crawl
initiates from the root directory again, but I guess that's OK).

[
    NOTE:

    Throttling is optional for the signer daemon, without which
    it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER"
    predefined in CFLAGS enables CPU throttling (during checksum
    calculation) thereby avoiding high CPU usage.
]

Subsequent patches would introduce CPU throttling during hash
calculation for scrubber.

&gt; Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f
&gt; BUG: 1207020
&gt; Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10511
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;

Change-Id: I5a125b2d0ac7dafd3e278b7fe4c6c9dd07af76dd
Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
BUG: 1220041
Reviewed-on: http://review.gluster.org/10720
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch introduces a multithreaded filesystem scrubber based
on the throttling option configured for a particular volume. The
implementation "logically" breaks scanning and scrubbing with
the number of scrubber threads auto-configured depending upon
the throttle configuration. Scanning (crawling) is left single
threaded (per brick) with entries scrubbed in bulk. On reaching
this "bulk" watermark, scanner waits until entries are scrubbed.
Bricks for a particular volume have a set of thread(s) assigned
for scrubbing, with entries for each brick scrubbed in a round
robin fashion to avoid scrub "stalls" when a brick (out of N
bricks) is under active scrubbing.

This mechanism helps us implement "pause/resume" with ease: all
one needs to do is clean up the scrubber threads and let the main
scanner thread "wait" until scrubbing is resumed (where the
scrubber thread(s) are spawned again), therefore continuing
where we left off (unless we restart the daemons, where the crawl
initiates from the root directory again, but I guess that's OK).

[
    NOTE:

    Throttling is optional for the signer daemon, without which
    it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER"
    predefined in CFLAGS enables CPU throttling (during checksum
    calculation) thereby avoiding high CPU usage.
]

Subsequent patches would introduce CPU throttling during hash
calculation for scrubber.

&gt; Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f
&gt; BUG: 1207020
&gt; Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10511
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;

Change-Id: I5a125b2d0ac7dafd3e278b7fe4c6c9dd07af76dd
Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
BUG: 1220041
Reviewed-on: http://review.gluster.org/10720
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>features/bitrot: Follow xattr naming conventions</title>
<updated>2015-05-10T12:28:43+00:00</updated>
<author>
<name>Venky Shankar</name>
<email>vshankar@redhat.com</email>
</author>
<published>2015-04-09T10:56:31+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=738620a5eeeee3802c09275831ac2b85d4ce91e5'/>
<id>738620a5eeeee3802c09275831ac2b85d4ce91e5</id>
<content type='text'>
Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*"

NOTE:
With this patch, data on existing volumes would be re-signed
(which should be OK since we do not expect many users as of
now :-))

&gt; Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0
&gt; BUG: 1170075
&gt; Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10181
&gt; Tested-by: NetBSD Build System
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;

Change-Id: I3c18d7dc2db4beaca6e8d8d231b4171a7b18795f
Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
BUG: 1220041
Reviewed-on: http://review.gluster.org/10718
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*"

NOTE:
With this patch, data on existing volumes would be re-signed
(which should be OK since we do not expect many users as of
now :-))

&gt; Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0
&gt; BUG: 1170075
&gt; Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10181
&gt; Tested-by: NetBSD Build System
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;

Change-Id: I3c18d7dc2db4beaca6e8d8d231b4171a7b18795f
Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
BUG: 1220041
Reviewed-on: http://review.gluster.org/10718
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>core: Global timer-wheel</title>
<updated>2015-05-10T12:27:40+00:00</updated>
<author>
<name>Venky Shankar</name>
<email>vshankar@redhat.com</email>
</author>
<published>2015-04-24T04:40:35+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=1a217b2a0295ca4d9068ee5c17d6a4374cc5f8fc'/>
<id>1a217b2a0295ca4d9068ee5c17d6a4374cc5f8fc</id>
<content type='text'>
Instantiate a process wide global instance of the timer wheel
data structure. Spawning glusterfs* process with option arg
"--global-timer-wheel" instantiates a global instance of
timer-wheel under global context (-&gt;ctx).

Translators can make use of this process wide instance [via a
call to glusterfs_global_timer_wheel()] instead of maintaining
an instance of their own and possibly consuming more memory.
The Linux kernel too has a single instance of the timer wheel, which
subsystems such as IO, networking, etc. make use of.

Bitrot daemons would be early consumers of this: bitrot translator
instances for multiple volumes would track objects belonging to
their respective bricks in this global expiry tracking data
structure. This is also a first step to move GlusterFS timer
mechanism to use timer-wheel.
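
To illustrate the idea of one shared wheel per process, here is a
minimal single-level timer wheel in Python. It is a sketch only and
does not mirror the C data structure; TimerWheel, its methods and
global_timer_wheel() are hypothetical names that merely echo the
role of glusterfs_global_timer_wheel().

```python
class TimerWheel:
    """Single-level timer wheel; delays are measured in ticks."""

    def __init__(self, slots=8):
        self.slots = [[] for _ in range(slots)]
        self.now = 0

    def add(self, delay, callback):
        ticks = max(1, delay)
        slot = (self.now + ticks) % len(self.slots)
        # Full wheel revolutions to skip before the timer is due.
        rounds = (ticks - 1) // len(self.slots)
        self.slots[slot].append([rounds, callback])

    def tick(self):
        self.now = (self.now + 1) % len(self.slots)
        # Detach the bucket first so callbacks may safely add timers.
        due, self.slots[self.now] = self.slots[self.now], []
        for rounds, callback in due:
            if rounds == 0:
                callback()
            else:
                self.slots[self.now].append([rounds - 1, callback])


_GLOBAL_WHEEL = None


def global_timer_wheel():
    """Return the one process-wide wheel, creating it on first use."""
    global _GLOBAL_WHEEL
    if _GLOBAL_WHEEL is None:
        _GLOBAL_WHEEL = TimerWheel()
    return _GLOBAL_WHEEL
```

Every caller of global_timer_wheel() gets the same object, so several
consumers share one set of slots instead of each allocating their own.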

&gt; Change-Id: Ie882df607e07acaced846ea269ebf1ece306d6ae
&gt; BUG: 1170075
&gt; Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10380
&gt; Tested-by: NetBSD Build System
&gt; Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;

Change-Id: I35c840daa9996a059699f8ea5af54c76ede7e09c
Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
Signed-off-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;
BUG: 1220041
Reviewed-on: http://review.gluster.org/10716
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instantiate a process wide global instance of the timer wheel
data structure. Spawning glusterfs* process with option arg
"--global-timer-wheel" instantiates a global instance of
timer-wheel under global context (-&gt;ctx).

Translators can make use of this process wide instance [via a
call to glusterfs_global_timer_wheel()] instead of maintaining
an instance of their own and possibly consuming more memory.
The Linux kernel too has a single instance of the timer wheel, which
subsystems such as IO, networking, etc. make use of.

Bitrot daemons would be early consumers of this: bitrot translator
instances for multiple volumes would track objects belonging to
their respective bricks in this global expiry tracking data
structure. This is also a first step to move GlusterFS timer
mechanism to use timer-wheel.

&gt; Change-Id: Ie882df607e07acaced846ea269ebf1ece306d6ae
&gt; BUG: 1170075
&gt; Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10380
&gt; Tested-by: NetBSD Build System
&gt; Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;

Change-Id: I35c840daa9996a059699f8ea5af54c76ede7e09c
Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
Signed-off-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;
BUG: 1220041
Reviewed-on: http://review.gluster.org/10716
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>CTR/Libgfdb: Log typo fix</title>
<updated>2015-05-10T09:30:18+00:00</updated>
<author>
<name>Joseph Fernandes</name>
<email>josferna@redhat.com</email>
</author>
<published>2015-05-08T11:06:00+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=25bb1061642bcaedfdfcab859a07244c2276571f'/>
<id>25bb1061642bcaedfdfcab859a07244c2276571f</id>
<content type='text'>
Log typo fix for CTR Xlator and Libgfdb

Change-Id: Ia39069a5ce9c48bbee937f1b5c5d749a30c9ac56
BUG: 1220100
Signed-off-by: Joseph Fernandes &lt;josferna@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10742
Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Log typo fix for CTR Xlator and Libgfdb

Change-Id: Ia39069a5ce9c48bbee937f1b5c5d749a30c9ac56
BUG: 1220100
Signed-off-by: Joseph Fernandes &lt;josferna@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10742
Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dht: make lookup-unhashed=auto do something actually useful</title>
<updated>2015-05-10T04:55:09+00:00</updated>
<author>
<name>Jeff Darcy</name>
<email>jdarcy@redhat.com</email>
</author>
<published>2014-05-07T19:31:30+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=243d61575c093c03b9beb014bf9d097646836e95'/>
<id>243d61575c093c03b9beb014bf9d097646836e95</id>
<content type='text'>
The key concept here is to determine whether a directory is "clean" by
comparing its last-known-good topology to the current one for the
volume.  These are stored as "commit hashes" on the directory and the
volume root respectively.  The volume's commit hash changes whenever a
brick is added or removed, and a fix-layout is done.  A directory's
commit hash changes only when a full rebalance (not just fix-layout)
is done on it.  If all bricks are present and have a directory
commit hash that matches the volume commit hash, then we can assume
that every file is in its "proper" place. Therefore, if we look for
a file in that proper place and don't find it, we can assume it's not
on any other subvolume and *safely* skip the global (broadcast to all)
lookup.
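
The check described above boils down to something like the following
Python sketch. The function name and the use of None for a brick that
is down or has no hash are assumptions made for the illustration;
the real logic lives in the dht translator's C code.

```python
def can_skip_global_lookup(volume_hash, brick_dir_hashes):
    """Decide whether a miss at the hashed location is conclusive.

    brick_dir_hashes holds one directory commit hash per brick, with
    None standing for a brick that is absent or has no hash stored.
    """
    if not brick_dir_hashes:
        return False
    # Every brick must be present and agree with the volume's hash.
    return all(h is not None and h == volume_hash for h in brick_dir_hashes)
```

For example, hashes [7, 7, 7] against a volume hash of 7 allow the
broadcast lookup to be skipped, while [7, None, 7] or [7, 6, 7] do not.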

Change-Id: Id6ce4593ba1f7daffa74cfab591cb45960629ae3
BUG: 1220064
Reviewed-on-master: http://review.gluster.org/#/c/7702/
Signed-off-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Signed-off-by: Shyam &lt;srangana@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10729
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The key concept here is to determine whether a directory is "clean" by
comparing its last-known-good topology to the current one for the
volume.  These are stored as "commit hashes" on the directory and the
volume root respectively.  The volume's commit hash changes whenever a
brick is added or removed, and a fix-layout is done.  A directory's
commit hash changes only when a full rebalance (not just fix-layout)
is done on it.  If all bricks are present and have a directory
commit hash that matches the volume commit hash, then we can assume
that every file is in its "proper" place. Therefore, if we look for
a file in that proper place and don't find it, we can assume it's not
on any other subvolume and *safely* skip the global (broadcast to all)
lookup.

Change-Id: Id6ce4593ba1f7daffa74cfab591cb45960629ae3
BUG: 1220064
Reviewed-on-master: http://review.gluster.org/#/c/7702/
Signed-off-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Signed-off-by: Shyam &lt;srangana@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10729
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bitrot/scrub: fix induced throttling in syncop_ftw_throttle()</title>
<updated>2015-05-10T03:13:16+00:00</updated>
<author>
<name>Venky Shankar</name>
<email>vshankar@redhat.com</email>
</author>
<published>2015-04-24T16:13:25+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=9acae44525798d7275c490c4e941fa88d214e46e'/>
<id>9acae44525798d7275c490c4e941fa88d214e46e</id>
<content type='text'>
Failing to reset scanning counter causes "incorrect" delay of around
50 seconds per directory entry. This causes scrubber to run extremely
slowly.

[
    NOTE: This is a temporary fix. With the introduction of token
          bucket based throttling, inducing throttle via sleep()
          call would be unneeded.
]

Also, fix logging messages in scrubber to log brick and full path
of the object which is identified/marked as corrupted.

&gt; Change-Id: Id501bd15dcdbd8a09613f80f9d84050304740027
&gt; BUG: 1170075
&gt; Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10375
&gt; Tested-by: NetBSD Build System
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
&gt; Reviewed-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;

Change-Id: I78f227f52f12549d62ecb35cbb70121424f7c2a7
BUG: 1220041
Reviewed-on: http://review.gluster.org/10714
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Failing to reset scanning counter causes "incorrect" delay of around
50 seconds per directory entry. This causes scrubber to run extremely
slowly.

[
    NOTE: This is a temporary fix. With the introduction of token
          bucket based throttling, inducing throttle via sleep()
          call would be unneeded.
]

Also, fix logging messages in scrubber to log brick and full path
of the object which is identified/marked as corrupted.

&gt; Change-Id: Id501bd15dcdbd8a09613f80f9d84050304740027
&gt; BUG: 1170075
&gt; Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/10375
&gt; Tested-by: NetBSD Build System
&gt; Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
&gt; Reviewed-by: Gaurav Kumar Garg &lt;ggarg@redhat.com&gt;

Change-Id: I78f227f52f12549d62ecb35cbb70121424f7c2a7
BUG: 1220041
Reviewed-on: http://review.gluster.org/10714
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>core: use reference counting for mem_acct structures</title>
<updated>2015-05-09T21:27:36+00:00</updated>
<author>
<name>Jeff Darcy</name>
<email>jdarcy@redhat.com</email>
</author>
<published>2015-04-28T08:40:00+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=a3af10a801a40fe990ee5db63c6dd6cb97713e4c'/>
<id>a3af10a801a40fe990ee5db63c6dd6cb97713e4c</id>
<content type='text'>
When freeing memory, our memory-accounting code expects to be able to
dereference from the (previously) allocated block to its owning
translator.  However, as we have already found once in option
validation and twice in logging, that translator might itself have
been freed and the dereference attempt causes one of our daemons to
crash with SIGSEGV.  This patch attempts to fix that as follows:

 * We no longer embed a struct mem_acct directly in a struct xlator,
   but instead allocate it separately.

 * Allocated memory blocks now contain a pointer to the mem_acct
   instead of the xlator.

 * The mem_acct structure contains a reference count, manipulated in
   both the normal and translator allocate/free code using atomic
   increments and decrements.

 * Because it's now a separate structure, we can defer freeing the
   mem_acct until its reference count reaches zero (either way).

 * Some unit tests were disabled, because they embedded their own
   copies of the implementation for what they were supposedly testing.
   Life's too short to spend time fixing tests that seem designed to
   impede progress by requiring a certain implementation as well as
   behavior.
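
The lifetime rule can be sketched in Python as below. This is an
illustration under assumptions, not the patch itself: MemAcct, Block
and free_block() are hypothetical names, and a lock stands in for the
atomic increments and decrements used in C.

```python
import threading


class MemAcct:
    """Hypothetical accounting record, allocated apart from its translator."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for atomic inc/dec in C
        self.refcnt = 1                # reference held by the translator
        self.freed = False

    def ref(self):
        with self._lock:
            self.refcnt += 1

    def unref(self):
        with self._lock:
            self.refcnt -= 1
            if self.refcnt == 0:
                self.freed = True      # only now is the record released


class Block:
    """An allocated block points at the MemAcct, not at the translator."""

    def __init__(self, acct):
        acct.ref()
        self.acct = acct


def free_block(block):
    block.acct.unref()
```

If the translator drops its reference while a block is still
outstanding, the record survives until that block is freed, so the
free path never touches released memory.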

Change-Id: Id929b11387927136f78626901729296b6c0d0fd7
BUG: 1219026
Signed-off-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10417
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10723
Tested-by: NetBSD Build System
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When freeing memory, our memory-accounting code expects to be able to
dereference from the (previously) allocated block to its owning
translator.  However, as we have already found once in option
validation and twice in logging, that translator might itself have
been freed and the dereference attempt causes one of our daemons to
crash with SIGSEGV.  This patch attempts to fix that as follows:

 * We no longer embed a struct mem_acct directly in a struct xlator,
   but instead allocate it separately.

 * Allocated memory blocks now contain a pointer to the mem_acct
   instead of the xlator.

 * The mem_acct structure contains a reference count, manipulated in
   both the normal and translator allocate/free code using atomic
   increments and decrements.

 * Because it's now a separate structure, we can defer freeing the
   mem_acct until its reference count reaches zero (either way).

 * Some unit tests were disabled, because they embedded their own
   copies of the implementation for what they were supposedly testing.
   Life's too short to spend time fixing tests that seem designed to
   impede progress by requiring a certain implementation as well as
   behavior.

Change-Id: Id929b11387927136f78626901729296b6c0d0fd7
BUG: 1219026
Signed-off-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10417
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10723
Tested-by: NetBSD Build System
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr : Prevent inode-evict during split-brain resolution</title>
<updated>2015-05-09T08:54:56+00:00</updated>
<author>
<name>Anuradha</name>
<email>atalur@redhat.com</email>
</author>
<published>2015-05-09T04:55:08+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=719c927592cfdb0de88243769d477ca211a2b494'/>
<id>719c927592cfdb0de88243769d477ca211a2b494</id>
<content type='text'>
        Backport of: http://review.gluster.org/#/c/10134/

1) Provided setfattr command to set timeout for split-brain
choice.

2) If split-brain inspection/resolution is being done
from the mount for a file, ref the inode when
split-brain-choice is set.
This inode will be unconditionally unref-ed after the timeout
(in seconds) set by the user, or after the default timeout otherwise.

3) Updated the doc and testcase to reflect the changes.

Change-Id: I15c9037dee28855f21e680e7e3632e1f48dba4e1
BUG: 1219388
Reviewed-on: http://review.gluster.org/10134
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Signed-off-by: Anuradha &lt;atalur@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10679
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/#/c/10134/

1) Provided setfattr command to set timeout for split-brain
choice.

2) If split-brain inspection/resolution is being done
from the mount for a file, ref the inode when
split-brain-choice is set.
This inode will be unconditionally unref-ed after the timeout
(in seconds) set by the user, or after the default timeout otherwise.

3) Updated the doc and testcase to reflect the changes.

Change-Id: I15c9037dee28855f21e680e7e3632e1f48dba4e1
BUG: 1219388
Reviewed-on: http://review.gluster.org/10134
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Signed-off-by: Anuradha &lt;atalur@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10679
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: data heal implementation for ec</title>
<updated>2015-05-08T22:05:30+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2015-04-25T10:28:09+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=f54b232b3cc61ee9ca76288958537b53de64de53'/>
<id>f54b232b3cc61ee9ca76288958537b53de64de53</id>
<content type='text'>
Data self-heal:
1) Take inode lock in domain 'this-&gt;name:self-heal' on 0-0 range (full file),
   so that no other processes try to do self-heal at the same time.
2) Take inode lock in domain 'this-&gt;name' on 0-0 range (full file).
3) Perform fxattrop+fstat and get the xattrs on all the bricks.
4) Choose the brick with ec-&gt;fragment number of same version as source
5) Truncate sinks
6) Unlock lock taken in 2)
7) For each block take full file lock, read from sources, write to the sinks, unlock
8) Take full file lock and see if the file is still a sane copy, i.e. the file didn't become unusable while the bricks were offline.
   Update mtime to before healing
9) xattrop with -ve values of 'dirty' and difference of highest and its own
   version values for version xattr
10) Unlock lock acquired in 8)
11) Unlock lock acquired in 1)
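
The arithmetic of the final xattrop step can be illustrated as below.
This is a Python sketch with made-up names (heal_xattrop_deltas and
the dict layout are assumptions), not the ec translator's C code.

```python
def heal_xattrop_deltas(bricks):
    """Compute the per-brick deltas applied once data heal is done.

    bricks maps a brick name to its current 'version' and 'dirty'
    counters. Each brick gets the negative of its own dirty count and
    the gap between the highest version and its own version.
    """
    highest = max(b["version"] for b in bricks.values())
    return {
        name: {"dirty": -b["dirty"], "version": highest - b["version"]}
        for name, b in bricks.items()
    }
```

Adding these deltas brings every brick's dirty count to zero and its
version up to the highest one seen.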

Change-Id: I6f4d42cd5423c767262c9d7bb5ca7767adb3e5fd
BUG: 1216303
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10384
Reviewed-on: http://review.gluster.org/10692
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Data self-heal:
1) Take inode lock in domain 'this-&gt;name:self-heal' on 0-0 range (full file),
   so that no other processes try to do self-heal at the same time.
2) Take inode lock in domain 'this-&gt;name' on 0-0 range (full file).
3) Perform fxattrop+fstat and get the xattrs on all the bricks.
4) Choose the brick with ec-&gt;fragment number of same version as source
5) Truncate sinks
6) Unlock lock taken in 2)
7) For each block take full file lock, read from sources, write to the sinks, unlock
8) Take full file lock and see if the file is still a sane copy, i.e. the file didn't become unusable while the bricks were offline.
   Update mtime to before healing
9) xattrop with -ve values of 'dirty' and difference of highest and its own
   version values for version xattr
10) Unlock lock acquired in 8)
11) Unlock lock acquired in 1)

Change-Id: I6f4d42cd5423c767262c9d7bb5ca7767adb3e5fd
BUG: 1216303
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10384
Reviewed-on: http://review.gluster.org/10692
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
