summaryrefslogtreecommitdiffstats
path: root/libglusterfs/src/gfdb/gfdb_sqlite3.c
Commit message (Collapse)AuthorAgeFilesLines
* tier/libgfdb: Extending log level flexibity in libgfdbJoseph Fernandes2015-11-111-3/+6
| | | | | | | | | | | | | | | Extending log level flexibity to relevant fops and operations This is an extension to http://review.gluster.org/#/c/12491/ Change-Id: I33b2f7732f89f52569fb99baa692c7e3bb4c7ab1 BUG: 1277352 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12567 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/ctr: Ignore CTR Lookup heal insert errorsJoseph Fernandes2015-11-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | CTR doesnt read from the DB, so to make sure that file records are created it does a heal during a lookup. It remembers the decision in the inode context cache and retrys periodically. When the volume is restarted it looses all the inode cache from the previous time and CTR lookup heals tries the heal again, but this time it finds that the records are already there from sql and logs this error, and remembers this until the volume is restarted or inode is flushed out of inode cache of the brick. Solution: the log levels should be reduced to trace for this case and customers need not see this. Change-Id: I67b568fb6904f8597e2c6d32894a247c4f500b94 BUG: 1277352 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12491 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/libgfdb: Replacing ASCII query file with binaryJoseph Fernandes2015-11-061-55/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Earlier, when the database was queried we used to save all the queried records in an ASCII format in the query file. This caused issues like filename having ASCII delimiter and used to take a lot of space. The tier.c file also had a lot of parsing code. Here we changed the format of the query file to binary. All the logic of serialization and formating of query record is done by libgfdb. Libgfdb provides API, gfdb_write_query_record() and gfdb_read_query_record(), which the user i.e tier migrator and CTR xlator can use to write to and read from query file. With this binary format we save on disk space i.e reduce to 50% atleast as we are saving GFID's in binary format 16 bytes and not the string format which takes 36 bytes + We are not saving path of the file + we are also saving on ASCII delimiters. The on disk format of query record is as follows, +---------------------------------------------------------------------------+ | Length of serialized query record | Serialized Query Record | +---------------------------------------------------------------------------+ 4 bytes Length of serialized query record | | -------------------------------------------------| | | V Serialized Query Record Format: +---------------------------------------------------------------------------+ | GFID | Link count | <LINK INFO> |..... | FOOTER | +---------------------------------------------------------------------------+ 16 B 4 B Link Length 4 B | | | | -----------------------------| | | | | | V | Each <Link Info> will be serialized as | +-----------------------------------------------+ | | PGID | BASE_NAME_LENGTH | BASE_NAME | | +-----------------------------------------------+ | 16 B 4 B BASE_NAME_LENGTH | | | ------------------------------------------------------------------------| | | V FOOTER is a magic number 0xBAADF00D indicating the end of the record. This also serves as a serialized schema validator. Change-Id: I9db7416fd421e118dd44eafab8b535caafe50d5a BUG: 1272207 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12354 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/ctr: Solution for db locks for tier migrator and ctr using sqlite ↵Joseph Fernandes2015-10-081-70/+191
| | | | | | | | | | | | | | | | | | | | | | | | | | | version less than 3.7 i.e rhel 6.7 Problem: On RHEL 6.7, we have sqlite version 3.6.2 which doesnt support WAL journaling mode, as this journaling mode is only available in sqlite 3.7 and above. As a result we cannot have to progreses concurrently accessing sqlite, without running into db locks! Well WAL is also need for performace on CTR side. Solution: This solution is to use CTR db connection for doing queries when WAL mode is absent. i,e tier migrator will send sync_op ipc calls to CTR, which in turn will do the query and create/update the query file suggested by tier migrator. Pending: Well this solution will stop the db locks but the performance is still an issue for CTR. We are developing an in-Memory Transaction Log (iMeTaL) which will help boost the CTR performance by doing in memory udpates on the IO path and later flush the updates to the db in a batch/segment flush. Change-Id: Ie3149643ded159234b5cc6aa6cf93b9022c2f124 BUG: 1240577 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12191 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Luis Pabon <lpabon@redhat.com>
* cluster/tier: add gluster v tier <vol>Dan Lambright2015-09-091-13/+13
| | | | | | | | | | | | | | | | | | | | | | | Currently the tier feature piggy backs off the rebalance command syntax to obtain status and this is clumsy. Introduce a new tier command that can do tier specific operations, starting with volume status to display counters. Old commands: gluster volume attach-tier <vol> [replica count] {bricklist..} gluster volume detach-tier <vol> {start|stop|commit} New commands: gluster volume tier <vol> attach [replica count] {bricklist} | detach {start|stop|commit} | status Change-Id: Ic07b3c6260588162de7d34380f8cbd3d8a7f35d3 BUG: 1255693 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11984 Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tier/ctr: Solving DB Lock issue due to write contention from db connectionsJoseph Fernandes2015-09-081-18/+25
| | | | | | | | | | | | | | | | | | | | | | | Problem: The DB on the brick is been accessed by CTR, for write and tier migrator, for read and write. The write from tier migrator is reseting the heat counters after a cycle. Since we are using sqlite, two connections trying to write would cause a db lock contention. As a result CTR used to fail to update the db. Solution: Using the same db connection of CTR for reseting the heat counters. 1) Introducted a new IPC FOP for CTR 2) After the query do a ipc syncop to the underlying client xlator associated to the brick. 3) CTR in brick will catch the IPC FOP and cleat the heat counters. Change-Id: I53306bfc08dcdba479deb4ccc154896521336150 BUG: 1260730 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12031 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* all: reduce "inline" usageJeff Darcy2015-09-011-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | There are three kinds of inline functions: plain inline, extern inline, and static inline. All three have been removed from .c files, except those in "contrib" which aren't our problem. Inlines in .h files, which are overwhelmingly "static inline" already, have generally been left alone. Over time we should be able to "lower" these into .c files, but that has to be done in a case-by-case fashion requiring more manual effort. This part was easy to do automatically without (as far as I can tell) any ill effect. In the process, several pieces of dead code were flagged by the compiler, and were removed. Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155 BUG: 1245331 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/11769 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* cluster/tier: fix 64 bit issue with sql query using timesDan Lambright2015-08-131-4/+4
| | | | | | | | | | | | | We overflowed when converting seconds to usecs in preperation for sql queries. The fix uses uint64_t throughout including subexpressions. Change-Id: I59bdb742197400dede97f54735b52030920b0d19 BUG: 1231268 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11885 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes
* tier/libgfdb : Setting Freq counters of un-selected files to zeroJoseph Fernandes2015-08-121-0/+18
| | | | | | | | | | | | | | | | | | | | | | | Change Time Recorder increments the write/read frequency counters on a read or write of a file, if the "features.record-counters" is "on". It is the responsibility of the tiering migrator to reset these counters to zero for un-selected files to reset them to zero as frequency counters are function of promotion/Demotion cycles. If the counters are not set to zero then, 1) the counters may overflow in the DB 2) The file may be wrongly promoted or demoted. This fix will reset the freq counters of un-selected files to zero after promotion/demotion frequency. Change-Id: Ideea2c76a52d421a7e67c37fb0c823f552b3da7a BUG: 1242504 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/11648 Tested-by: Joseph Fernandes Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* libgfdb/sql: Fixing broken query of find_unchangedJoseph Fernandes2015-07-101-2/+2
| | | | | | | | | | | | | | | | | | | The find_unchanged query should be "write_heat <= defined_heat" AND "read_heat <= defined_heat" and not "write_heat <= defined_heat" OR "read_heat <= defined_heat" Change-Id: Ie82e02aafbb7ea14563007307de3350ea022049a BUG: 1240970 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/11577 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Joseph Fernandes Tested-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tier/ctr: Ignore creation of T file and Ctr Lookup heal improvememntsJoseph Fernandes2015-06-261-2/+0
| | | | | | | | | | | | | | | | | | 1) Ignore creation of T file in ctr_mknod 2) Ignore lookup for T file in ctr_lookup 3) Ctr_lookup: a. If the gfid and pgfid in empty dont record b. Decreased log level for multiple heal attempts c. Inode/File heal happens after an expiry period, which is configurable. d. Hardlink heal happens after an expiry period, which is configurable. Change-Id: Id8eb5092e78beaec22d05f5283645081619e2452 BUG: 1235269 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/11334 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* gfdb/libglusterfs : Porting to a new logging frameworkMohamed Ashiq2015-06-101-87/+106
| | | | | | | | | | Change-Id: I61874561fdf2c175d2b3eec0c85c25f12dc60552 BUG: 1194640 Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com> Reviewed-on: http://review.gluster.org/10819 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes
* tier/tier.t: Fixing tier.t crash in regression runsJoseph Fernandes2015-05-271-6/+5
| | | | | | | | | | | | | | | | 1) If the database file exists a. Dont try re-creating the db schema b. Dont try re-configuring the db. 2) Dont assert in fini_db () when connection is NULL Change-Id: I15dd103fe7542f70113c1d5e539a99f8cd062be4 BUG: 1163543 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10870 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ctr : Fix for heating of files during promotion/demotionJoseph Fernandes2015-04-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fix will solve the heating of the files during the promotion or demotion. Promotion: ~~~~~~~~~ When a file gets promoted it get the current time stamp during creation only, but following writes or reads during the migration wont heat the file. Demotion: ~~~~~~~~ When a file gets demoted it get the wind/unwind time stamp is set to zero. The following writes or reads during the migration wont heat the file. What is remaining ? ~~~~~~~~~~~~~~~~~ Bug 1209129 ( https://bugzilla.redhat.com/show_bug.cgi?id=1209129 ) Inspite of this fix there is still a issue remaining, i.e the heat of the file is not keep intact during a internal rebalance activity i.e a rebalance within a tier. Change-Id: I01e82dc226355599732d40e699062cee7960b0a5 BUG: 1207867 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10080 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* libgfdb: fix possible illegal memory access (CID 1288820)Michael Adam2015-04-021-1/+2
| | | | | | | | | | | | | | | | | | Coverity CID 1288820 strncpy executed with a limit equal to the target array size potentially leaves the target string not null terminated. Make sure the copied string is a valid 0 terminated string. Change-Id: I39ff6a64ca5b9e30562226dd34c5b06267b75b87 BUG: 789278 Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-on: http://review.gluster.org/10063 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Poornima G <pgurusid@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Joseph Fernandes <josferna@redhat.com> Tested-by: Joseph Fernandes <josferna@redhat.com>
* Gfdb Query Fix and Volume option fixJoseph Fernandes2015-03-301-2/+2
| | | | | | | | | | | | | | 1) Query fix in find_changed_with_freq() 2) Volume option typo fix for write_freq_threshold and read_freq_threshold Change-Id: I38e154818178aab412b2d7b2914cd29acef66ffb BUG: 1207343 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10050 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Adding Libgfdb to GlusterFSJoseph Fernandes2015-03-181-0/+1101
************************************************************************* Libgfdb | ************************************************************************* Libgfdb provides abstract mechanism to record extra/rich metadata required for data maintenance, such as data tiering/classification. It provides consumer with API for recording and querying, keeping the consumer abstracted from the data store used beneath for storing data. It works in a plug-and-play model, where data stores can be plugged-in. Presently we have plugin for Sqlite3. In the future will provide recording and querying performance optimizer. In the current implementation the schema of metadata is fixed. Schema: ~~~~~~ GF_FILE_TB Table: ~~~~~~~~~~~~~~~~~ This table has one entry per file inode. It holds the metadata required to make decisions in data maintenance. GF_ID (Primary key) : File GFID (Universal Unique IDentifier in the namespace) W_SEC, W_MSEC : Write wind time in sec & micro-sec UW_SEC, UW_MSEC : Write un-wind time in sec & micro-sec W_READ_SEC, W_READ_MSEC : Read wind time in sec & micro-sec UW_READ_SEC, UW_READ_MSEC : Read un-wind time in sec & micro-sec WRITE_FREQ_CNTR INTEGER : Write Frequency Counter READ_FREQ_CNTR INTEGER : Read Frequency Counter GF_FLINK_TABLE: ~~~~~~~~~~~~~~ This table has all the hardlinks to a file inode. GF_ID : File GFID (Composite Primary Key)``| GF_PID : Parent Directory GFID (Composite Primary Key) |-> Primary Key FNAME : File Base Name (Composite Primary Key)__| FPATH : File Full Path (Its redundant for now, this will go) W_DEL_FLAG : This Flag is used for crash consistancy, when a link is unlinked. i.e Set to 1 during unlink wind and during unwind this record is deleted LINK_UPDATE : This Flag is used when a link is changed i.e rename. Set to 1 when rename wind and set to 0 in rename unwind Libgfdb API: ~~~~~~~~~~~ Refer libglusterfs/src/gfdb/gfdb_data_store.h Change-Id: I2e9fbab3878ce630a7f41221ef61017dc43db11f BUG: 1194753 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/9683 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>