tungsten-replicator's Issues

PostgreSQL: support for different PG and Replicator home folders for sandbox-like deployment

In order to be able to set up a sandbox-like PG cluster on a single host (as
opposed to a virtualized or multi-host environment), two enhancements are
needed:
[RESOLVED] 1. Ability to define a custom port for the current PG and [on the
slave side] for the master's PG.
2. Ability to define different paths for the PG and Replicator home folders on
the current node and on the master's node.

This issue addresses (2).

There are multiple places in the *.rb code where ssh, scp and rsync calls are
made - all with the same assumption - that the PG and Replicator home folders
are identical on this node and on the master's node. This assumption does not
allow setup of a cluster on a single host. We need to make both home folders
configurable on a per-node basis.

Original issue reported on code.google.com by [email protected] on 23 Apr 2011 at 12:26

Tungsten Replicator fails to extract from MySQL 5.5 binlog if clients use utf8mb4 character set

Binlog extraction fails on MySQL 5.5.11 due to the presence of the utf8mb4
character set, which was added in version 5.5.3.  The problem is that Tungsten
does not list this name as a synonym for UTF-8 in the MySQL extractor.
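
A minimal sketch of the kind of mapping the report implies, with hypothetical
names (the real extractor's charset table is not shown here): the MySQL
character-set name, including the utf8mb4 alias added in 5.5.3, is translated
to a Java charset name before any bytes are decoded.

import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

public class MySQLCharsetNames
{
    // Hypothetical synonym table. The point of the fix: utf8mb4 (added in
    // MySQL 5.5.3) must map to the same Java charset as utf8, otherwise
    // decoding fails with the UnsupportedEncodingException shown below.
    private static final Map<String, String> MYSQL_TO_JAVA = new HashMap<String, String>();
    static
    {
        MYSQL_TO_JAVA.put("utf8", "UTF-8");
        MYSQL_TO_JAVA.put("utf8mb4", "UTF-8"); // the missing synonym
        MYSQL_TO_JAVA.put("latin1", "ISO8859_1");
    }

    public static Charset toJavaCharset(String mysqlName)
    {
        String javaName = MYSQL_TO_JAVA.get(mysqlName.toLowerCase());
        if (javaName == null)
            throw new IllegalArgumentException("Unmapped MySQL charset: " + mysqlName);
        return Charset.forName(javaName);
    }

    public static void main(String[] args)
    {
        byte[] statement = "create database foo".getBytes(toJavaCharset("utf8mb4"));
        System.out.println(new String(statement, toJavaCharset("utf8mb4")));
    }
}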

To duplicate: 

1.) Set up a Tungsten manager against MySQL version 5.5.11 (any version from
5.5.3 onward should also reproduce the problem).

2.) Enter the following client data in mysql: 

set names utf8mb4;
set session binlog_format=statement;
create database foo;

3.) The replicator will go offline with the following error in the log: 

INFO   | jvm 1    | 2011/04/25 18:00:38 | 2011-04-25 18:00:38,182 ERROR replicator.management.OpenReplicatorManager Received error notification, shutting down services: Event extraction failed: Unexpected failure while extracting event mysql-bin.000006 (534)
INFO   | jvm 1    | 2011/04/25 18:00:38 | com.continuent.tungsten.replicator.extractor.ExtractorException: Unexpected failure while extracting event mysql-bin.000006 (534)
INFO   | jvm 1    | 2011/04/25 18:00:38 |   at com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractor.extractEvent(MySQLExtractor.java:1138)
INFO   | jvm 1    | 2011/04/25 18:00:38 |   at com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractor.extract(MySQLExtractor.java:1158)
INFO   | jvm 1    | 2011/04/25 18:00:38 |   at com.continuent.tungsten.replicator.extractor.ExtractorWrapper.extract(ExtractorWrapper.java:95)
INFO   | jvm 1    | 2011/04/25 18:00:38 |   at com.continuent.tungsten.replicator.extractor.ExtractorWrapper.extract(ExtractorWrapper.java:1)
INFO   | jvm 1    | 2011/04/25 18:00:38 |   at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(SingleThreadStageTask.java:217)
INFO   | jvm 1    | 2011/04/25 18:00:38 |   at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThreadStageTask.java:148)
INFO   | jvm 1    | 2011/04/25 18:00:38 |   at java.lang.Thread.run(Thread.java:636)
INFO   | jvm 1    | 2011/04/25 18:00:38 | Caused by: java.io.UnsupportedEncodingException:
INFO   | jvm 1    | 2011/04/25 18:00:38 |   at java.lang.StringCoding.decode(StringCoding.java:188)
INFO   | jvm 1    | 2011/04/25 18:00:38 |   at java.lang.String.<init>(String.java:451)

Original issue reported on code.google.com by [email protected] on 26 Apr 2011 at 1:19

tungsten-installer does not accept --verbose, --no-validation and --validate-only unless they come first

What steps will reproduce the problem?
./tungsten-installer --master-slave (many more options) --verbose

What is the expected output?
More information, and the ability to use the --verbose option anywhere.
Alternatively, a better error message that says "put this option before any
--master-slave or --direct".

What do you see instead?
A message warning of an invalid option "--verbose"




Original issue reported on code.google.com by g.maxia on 1 May 2011 at 3:59

Clean up Tungsten release directory structure

The Tungsten release directory currently has a rather confusing set of 
configuration files and logs.  This makes it difficult for users to learn to 
operate the replicator and diagnose problems. 

Tungsten replicator configuration files will be simplified so that only active
configuration files are stored in the tungsten-replicator/conf directory.  All
other configuration templates and files will be moved to a subdirectory.


Original issue reported on code.google.com by [email protected] on 25 Apr 2011 at 3:24

Source ID is not reported correctly in direct role replication

When using the "direct" role (a slave that deploys a remote master service),
the source_id is displayed as the slave host instead of the master host.

For example, I have the slave on qa.m4.continuent.com, and I create the service 
as follows:

./configure -b
./configure-service --create \
    -c tungsten.cfg \
    --role=direct \
    --extract-db-host=qa.r1.continuent.com \
    --extract-db-port=3306 \
    --extract-db-user=tungsten \
    --extract-db-password=secret \
    --master-host=qa.r1.continuent.com \
    --service-type=remote \
    --extract-method=relay \
    --channels=10 dragon


mysql -e 'select task_id, seqno,source_id, eventid from 
tungsten_dragon.trep_commit_seqno'
+---------+-------+----------------------+--------------------------------+
| task_id | seqno | source_id            | eventid                        |
+---------+-------+----------------------+--------------------------------+
|       0 |   608 | qa.r4.continuent.com | 000002:0000000000109477;208969 |
|       1 |   599 | qa.r4.continuent.com | 000002:0000000000107845;208960 |
|       2 |   599 | qa.r4.continuent.com | 000002:0000000000107845;208960 |
|       3 |   599 | qa.r4.continuent.com | 000002:0000000000107845;208960 |
|       4 |   599 | qa.r4.continuent.com | 000002:0000000000107845;208960 |
|       5 |   599 | qa.r4.continuent.com | 000002:0000000000107845;208960 |
|       6 |   599 | qa.r4.continuent.com | 000002:0000000000107845;208960 |
|       7 |   607 | qa.r4.continuent.com | 000002:0000000000109296;208968 |
|       8 |   608 | qa.r4.continuent.com | 000002:0000000000109477;208969 |
|       9 |   599 | qa.r4.continuent.com | 000002:0000000000107845;208960 |
+---------+-------+----------------------+--------------------------------+

I would expect to see qa.r1.continuent.com as source_id.

Why is this important?
One typical scenario is when a replicator supports two unrelated database
servers on the same host, each of which replicates from a different master.
Another scenario is when we use the direct role to implement multiple masters
or a fan-in topology.
In both cases the source ID becomes very important.

Original issue reported on code.google.com by g.maxia on 20 Apr 2011 at 12:42

documentation about backup and restore mentions wrong result status

According to the Tungsten Replicator Guide, page 41:

"Both backup and restore operations return Tungsten Replicator to the 
OFFLINE:NORMAL if they succeed."

That is not true for the restore operation. At the end of the restore, the 
replicator is ONLINE.

Either the manual is incorrect, or the replicator is getting the wrong status 
at the end of the operation.

Original issue reported on code.google.com by g.maxia on 21 Apr 2011 at 2:00

Some records get lost in multiple master replication

In a three-master setup, when each master creates a different table and
inserts two records into it, one of the records is lost, i.e. it is not
applied, with no errors or warnings.

The topology is the following:
server alpha: (HOST1)
local master service alpha, remote slave services bravo and charlie

server bravo: (HOST2)
local master service bravo, remote slave services alpha and charlie

server charlie: (HOST3)
local master service charlie, remote slave services bravo and alpha


The commands executed for this test were the following

$MYSQL -h $HOST1 -e 'drop table if exists test.t1'
$MYSQL -h $HOST2 -e 'drop table if exists test.t2'
$MYSQL -h $HOST3 -e 'drop table if exists test.t3'
$MYSQL -h $HOST1 -e 'create table test.t1(i int)'
$MYSQL -h $HOST2 -e 'create table test.t2(i int)'
$MYSQL -h $HOST3 -e 'create table test.t3(i int)'

MAXRECS=2
echo "inserting $MAXRECS records into each of the three masters. Please wait"
for CNT in $(seq 1 $MAXRECS)
do
    $MYSQL -h $HOST1 -e "insert into test.t1 values ($CNT)"
    $MYSQL -h $HOST2 -e "insert into test.t2 values ($CNT)"
    $MYSQL -h $HOST3 -e "insert into test.t3 values ($CNT)"
done

This was the result:
Retrieving data from the masters
qa.m1.continuent.com
+-------+-------+-----+
| table | count | sum |
+-------+-------+-----+
| t1    |     2 |   3 |
| t2    |     1 |   1 |
| t3    |     2 |   3 |
+-------+-------+-----+

qa.m2.continuent.com
+-------+-------+-----+
| table | count | sum |
+-------+-------+-----+
| t1    |     2 |   3 |
| t2    |     2 |   3 |
| t3    |     2 |   3 |
+-------+-------+-----+

qa.m3.continuent.com
+-------+-------+-----+
| table | count | sum |
+-------+-------+-----+
| t1    |     2 |   3 |
| t2    |     1 |   1 |
| t3    |     2 |   3 |
+-------+-------+-----+

As you can see, one record is missing for table t2 in both HOST1 and HOST3.

Looking at the logs, I could determine that the command "insert into test.t2
values (2)" was missing from the THL for services alpha and charlie.
Please see the attached logs, trepctl and thl output, configuration files, and
some more observation files for more detail.


Comment by Giuseppe Maxia [24/Mar/11 06:18 PM]
logs, and other monitoring data for TUC-302

Comment by Giuseppe Maxia [25/Mar/11 03:47 AM]
The failure could be related to block commit.
Changing the loop this way works:

for CNT in $(seq 1 $MAXRECS)
do
    $MYSQL -h $HOST1 -e "insert into test.t1 values ($CNT)"
    sleep 0.1
    $MYSQL -h $HOST2 -e "insert into test.t2 values ($CNT)"
    sleep 0.1
    $MYSQL -h $HOST3 -e "insert into test.t3 values ($CNT)"
    sleep 0.1
done


Comment by Giuseppe Maxia [25/Mar/11 05:39 AM]
It definitely seems related to block commit.
After changing the following lines in static-SERVICENAME.properties, the bug
does not show up:
replicator.stage.d-pq-to-dbms.blockCommitRowCount=1
replicator.stage.q-to-dbms.blockCommitRowCount=1

Comment by Robert Hodges [29/Mar/11 12:37 AM]
I have confirmed the block commit problem in other tests. It appears that it is 
enough to set the block commit values to 1 on the remote services only.

The problem may be related to auto-commit transactions, which seem to mess up 
our demarcation of the originating service of particular transactions. 


Original issue reported on code.google.com by [email protected] on 14 Apr 2011 at 9:18

tungsten-installer missing option : --start

tungsten-installer is missing an option that will start the replicator after 
the installation.

I suggest 
--start, which will only start the replicator on all servers
and
--start-and-report, which will start the replicator and then run 
  trepctl --services

Original issue reported on code.google.com by g.maxia on 1 May 2011 at 4:21

Replication failure after long delay between queries...

Setup of systems.
1. Four XEN VMs on the same subnet. (Different XEN boxes)
2. Full master/master replication setup.

What steps will reproduce the problem?
1. Setup everything, run updates all is well.
2. Let sit for 10 hours
3. Run trepctl services to verify all is marked online.
4. Run a simple create.
5. Run trepctl services on all boxes, server I ran queries from is offline, all 
other services are attempting to synch to that master.
6. Run trepctl -service alpha online # on master. It goes back online.
7. run trepctl services on other boxes, "alpha" shows offline.
8. Run trepctl -service alpha online # on a slave, alpha goes online, that 
service goess offline:error.

What is the expected output? 
Everything should be replicated, all services online, with no manual intervention.

What do you see instead?
The master which ran the queries is offline. Most information is in the steps above.

What version of the product are you using? On what operating system?
2.0.2, on CentOS 5.5

Please provide any additional information below.
log/trepsvc.log of ALPHA (abbreviated: one line from before my query, the
rest after). Attached as a file.



Original issue reported on code.google.com by [email protected] on 22 Apr 2011 at 4:53


option names for the same thing should be common to all modes and have a single name

The following options should be unified:
--slave-thl-directory
--slave-relay-directory

--thl-directory
--relay-directory

Only the last two should be kept, and they should be allowed in both
--master-slave and --direct modes.

--tungsten-home should also be allowed in both modes (see issue #45 for more
details)

Original issue reported on code.google.com by g.maxia on 1 May 2011 at 4:46

Need installation support for direct pipeline where master is on a remote host

Tungsten does not have installation support for direct pipelines where the 
master is on a different host from the host on which we are applying data.  

A proposed solution is to have configure-service support --extract-db-host,
--extract-db-port, etc. options that can override the host name, port, login,
and password for the head extractor.

Original issue reported on code.google.com by [email protected] on 19 Apr 2011 at 10:16

Tungsten slaves react poorly when they cannot obtain needed sequence number from master

The master/slave connection protocol does not handle certain corner cases that 
arise when the master log does not contain values needed by the slave.  

Case 1:  Master log starts at higher value than that needed by slave. 

1. Start up a master and a slave with service name "foo" and confirm they are 
connected. 
2. Stop the slave. 
3. Perform one or more transactions on the master MySQL instance. 
4. Stop the master, clear the THL log files, but *do not* clear the value in 
tungsten_foo. 
5. Restart the master.  The master will start numbering its log higher than the 
slave's next required sequence number.  
6. Restart the slave.  

At this point, the slave will print the following: 

$trepctl status
...
pendingError           : Event extraction failed: Client handshake failure: 
Client response validation failed: Client log has higher sequence number than 
master: client source ID=logos2 seqno=0 client epoch number=0
...

This message is false.  It should say that the master could not find the 
requested ID.  Here is a stack trace that shows where the error arises on the 
master. 

INFO   | jvm 1    | 2011/04/23 09:08:35 | com.continuent.tungsten.replicator.thl.THLException: Client log has higher sequence number than master: client source ID=logos2 seqno=0 client epoch number=0
INFO   | jvm 1    | 2011/04/23 09:08:35 |   at com.continuent.tungsten.replicator.thl.ConnectorHandler$LogValidator.validateResponse(ConnectorHandler.java:94)
INFO   | jvm 1    | 2011/04/23 09:08:35 |   at com.continuent.tungsten.replicator.thl.Protocol.serverHandshake(Protocol.java:216)
INFO   | jvm 1    | 2011/04/23 09:08:35 |   at com.continuent.tungsten.replicator.thl.ConnectorHandler.run(ConnectorHandler.java:179)
INFO   | jvm 1    | 2011/04/23 09:08:35 |   at java.lang.Thread.run(Thread.java:636)

Case 2: Master starts at higher value than uninitialized slave. 

1. Create a new master and slave on service foo but do not start them. 
2. Start the master only.  
3. Perform one or more transactions on the master MySQL instance. 
4. Stop the master, clear the THL log files, but *do not* clear the value in 
tungsten_foo. 
5. Restart the master.  The master will start numbering its log higher than the 
slave's next required sequence number.  
6. Start the slave.  

In this case the slave just hangs in the GOING-ONLINE:SYNCHRONIZING state.
It will keep trying to reconnect to the master without signaling an error.
You must kill the slave process, as it does not respond to 'trepctl offline'.
On some systems the JVM will run out of file descriptors and print a message
like the following:

INFO   | jvm 1    | 2011/04/19 13:10:12 | WARNING: RMI TCP Accept-0: accept loop for ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=52405] throws
INFO   | jvm 1    | 2011/04/19 13:10:12 | java.net.SocketException: Too many open files
INFO   | jvm 1    | 2011/04/19 13:10:12 |       at java.net.PlainSocketImpl.socketAccept(Native Method)
INFO   | jvm 1    | 2011/04/19 13:10:12 |       at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
INFO   | jvm 1    | 2011/04/19 13:10:12 |       at java.net.ServerSocket.implAccept(ServerSocket.java:462)
INFO   | jvm 1    | 2011/04/19 13:10:12 |       at java.net.ServerSocket.accept(ServerSocket.java:430)
INFO   | jvm 1    | 2011/04/19 13:10:12 |       at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
INFO   | jvm 1    | 2011/04/19 13:10:12 |       at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
INFO   | jvm 1    | 2011/04/19 13:10:12 |       at java.lang.Thread.run(Thread.java:662)

Original issue reported on code.google.com by [email protected] on 23 Apr 2011 at 4:31

WARNING from running trepctl services

What steps will reproduce the problem?
1. Run trepctl services.
2. Check the log file.

What is the expected output? What do you see instead?
Expected: nothing. Instead: the warning below, complaining about wait_timeout.

What version of the product are you using? On what operating system?
2.0.2, CentOS 5.5, MySQL 5.1.56

Please provide any additional information below.

INFO   | jvm 1    | 2011/04/22 12:54:26 | 2011-04-22 12:54:25,985 INFO  replicator.conf.PropertiesManager Reading static properties file: /opt/tungsten-replicator-2.0.2/cluster-home/bin/../../tungsten-replicator/conf/static-alpha.properties
INFO   | jvm 1    | 2011/04/22 12:54:26 | 2011-04-22 12:54:25,986 INFO  replicator.conf.PropertiesManager Reading static properties file: /opt/tungsten-replicator-2.0.2/cluster-home/bin/../../tungsten-replicator/conf/static-beta.properties
INFO   | jvm 1    | 2011/04/22 12:54:26 | 2011-04-22 12:54:25,987 INFO  replicator.conf.PropertiesManager Reading static properties file: /opt/tungsten-replicator-2.0.2/cluster-home/bin/../../tungsten-replicator/conf/static-charlie.properties
INFO   | jvm 1    | 2011/04/22 12:54:26 | 2011-04-22 12:54:25,989 INFO  replicator.conf.PropertiesManager Reading static properties file: /opt/tungsten-replicator-2.0.2/cluster-home/bin/../../tungsten-replicator/conf/static-delta.properties
INFO   | jvm 1    | 2011/04/22 12:54:26 | Apr 22, 2011 12:54:26 PM org.drizzle.jdbc.internal.mysql.MySQLProtocol executeQuery
INFO   | jvm 1    | 2011/04/22 12:54:26 | WARNING: Could not execute query org.drizzle.jdbc.internal.common.query.DrizzleQuery@674009d2: Variable 'wait_timeout' can't be set to the value of '99999999'

Original issue reported on code.google.com by [email protected] on 22 Apr 2011 at 4:56

`./trepctl stop` doesn't stop the Replicator process

I wanted to stop the Replicator process, so I executed `./trepctl stop`, but
it didn't work as expected. I had to use `./replicator stop` instead.

Comment by Stephane Giron [28/Jan/11 06:48 AM]
OpenReplicatorManagerCtrl excerpt:

                else if (command.equals(Commands.STOP))
                {
                    // TODO: Remove?
                    expectLostConnection = true;
                    // manager.stop();
                }

Should that code (or something similar) be enabled again?

Original issue reported on code.google.com by [email protected] on 14 Apr 2011 at 9:35

Upgrade THL to use buffered I/O

THL log file access is implemented using the Java RandomAccessFile class,
which treats a file as a long array.  This class does not use buffered I/O and
makes heavy use of disk metadata operations, which results in painfully slow
behavior on slow, highly contended, or NFS-mounted disks.   It also makes it
hard to support a single-writer, many-readers model, due to the high level of
disk I/O for reads.

The THL will be rewritten to use standard Java stream classes with buffered I/O 
for both reading and writing, with the following user-visible behavior: 

1. THL performance will not be bound by performance of the underlying disk 
subsystem. 
2. The THL will support multiple readers reading the same region of the log 
without causing additional I/O overhead for each reader. 
3. Users will be able to control the I/O buffer size and optionally select 
compression of files on disk to reduce storage requirements.  
4. Otherwise the semantics of the THL will be unchanged. 
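
As an illustration of the intended change (a sketch only, with invented
record framing, not the actual THL on-disk format), a buffered stream reader
pays for one large disk read per buffer fill rather than one small read per
record, which is exactly the behavior difference described above:

import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;

public class BufferedLogReader
{
    public static void main(String[] args) throws IOException
    {
        int bufferSize = 64 * 1024; // user-tunable, per point 3 above
        DataInputStream in = new DataInputStream(new BufferedInputStream(
                new FileInputStream(args[0]), bufferSize));
        long records = 0;
        try
        {
            while (true)
            {
                // Hypothetical framing: a 4-byte length prefix per record.
                int length = in.readInt();
                byte[] record = new byte[length];
                in.readFully(record); // payload decoding omitted
                records++;
            }
        }
        catch (EOFException done)
        {
            // End of log reached.
        }
        finally
        {
            in.close();
        }
        System.out.println("Read " + records + " records");
    }
}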

Original issue reported on code.google.com by [email protected] on 21 Apr 2011 at 11:45

Tungsten replicator with binary support enabled fails on MySQL 5.5

If you select binary statement transfer during configuration Tungsten 
Replicator fails on MySQL 5.5 with the following error: 

org.drizzle.jdbc.internal.common.QueryException: Could not connect: Bad handshake
    at org.drizzle.jdbc.internal.mysql.MySQLProtocol.<init>(MySQLProtocol.java:158)
    at org.drizzle.jdbc.DrizzleDriver.connect(DrizzleDriver.java:73)

Tungsten uses the Drizzle JDBC driver because it can submit statements as
bytes on slaves.  This enables correct replication of statements that contain
embedded binary data.  The problem is that Drizzle fails to connect to MySQL
5.5 and needs a patch.

There is a workaround for this that will get users up and running albeit with 
somewhat degraded capabilities: 

1.) Select binary transfer = false at service configuration time.  This will 
select the MySQL Connector/J driver. 

2.) After the service is created but before you start the replicator, edit the 
service properties file (e.g., conf/static-myservice.properties) and ensure the 
following property is set to true: 

# Use bytes to transfer strings.  This should be set to true when using MySQL
# row replication and the table or column character set differs from the
# character set in use on the server.
replicator.extractor.mysql.usingBytesForString=true

You can now connect.  However, you must use row replication (e.g., set global 
binlog_format=row) or binary and embedded charset data will not replicate 
correctly. 

Original issue reported on code.google.com by [email protected] on 16 Apr 2011 at 4:43

Tungsten replicator does not check port availability at startup

When the replicator starts, it does not check whether the ports it has been
assigned in the configuration file are free.
It will start nonetheless, and the user needs to check the log file to find
out what happened.
The replicator should not start under these conditions. It should check that
everything it needs is available, and fail if any of the resources are
missing.
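
A sketch of the kind of startup check being requested (not the replicator's
actual code; the port list is illustrative, using the THL port 2112 and the
default RMI port 10000 mentioned elsewhere on this page):

import java.io.IOException;
import java.net.ServerSocket;

public class StartupPortCheck
{
    public static void main(String[] args)
    {
        int[] requiredPorts = {2112, 10000}; // illustrative values
        for (int port : requiredPorts)
        {
            try
            {
                // Binding proves the port is free; close immediately so the
                // real service can claim it. (A small race remains between
                // the probe and the real bind.)
                new ServerSocket(port).close();
            }
            catch (IOException e)
            {
                System.err.println("Port " + port + " is unavailable: " + e.getMessage());
                System.exit(1);
            }
        }
        System.out.println("All required ports are free");
    }
}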

How to repeat:

(1) Install Tungsten Replicator from any version.
Then unpack a different version in a different directory, and install again.
Tungsten will start, and then complain in the log file.

(2) Install Tungsten in any directory, giving a non-existent directory as the
logs directory.
     Tungsten will start, but it will not work, as it will be unable to create the service directory.

Original issue reported on code.google.com by [email protected] on 14 Apr 2011 at 9:14

Race condition when a DDL statement is immediately followed by a DML statement on the same object

This bug is migrated from JIRA (http://forge.continuent.org/jira/browse/TUC-256)

Using Tungsten 2.0 parallel applier, there is an apparent race condition when 
loading data from a dump.

Case A. If I load all table definitions at once, and then start inserting data, 
all is well

Case B. If I load table definitions followed by insert statements (which is the 
normal case using mysqldump), replication breaks with " ERROR 
replicator.pipeline.SingleThreadStageTask [d-pq-to-dbms-3] Event application 
failed: seqno=10 fragno=0 message=java.sql.SQLException: Statement failed on 
slave but succeeded on master"
Looking at the error log, it says that the replicator was trying to insert into 
a table that did not exist.

How to repeat:
1) install mysql replication
2) replace the SQL thread with Tungsten 2.0 parallel apply, as described in the 
"Tungsten 2.0 installation", ver. 0.6
3) get the employees database from http://launchpad.net/test-db and load it to 
the master.
    The loading script will create all table definitions, and then load the data. This is case A.
4) Using mysqldump, dump all data from the employees database into a file
    mysqldump -h master employees > employees.sql
5) drop the employees database
6) load the employees database from the dump file
    mysql -h master -e 'create schema employees'
    mysql -h master employees < employees.sql
7) monitor the replicator operations.


Comment by Giuseppe Maxia [18/Feb/11 04:30 AM]
Archive containing the Tungsten build manifest, tungsten.cfg, all replicator 
properties files, the replicator error log

Comment by Giuseppe Maxia [18/Feb/11 06:01 AM]
The bug is also reproducible using the Sakila database, instead of the 
employees db.

1) download the sakila database from 
http://downloads.mysql.com/docs/sakila-db.tar.gz
2) load it to the master
3) create a mysqldump file.
4) delete the sakila database (in the master)
5) create the sakila schema again (in the master)
6) load the dump file (in the master)

Notice that this problem is not related to InnoDB: if I convert all tables to
MyISAM, the error is reproduced in exactly the same way.

Comment by Robert Hodges [20/Feb/11 08:31 PM]
This is an interesting problem. To find out more I ran 'thl -service <name> 
list' on the resulting Tungsten log and looked at the assigned shard IDs. This 
revealed a couple of oddities.

1.) The mysqldump puts commands like ALTER TABLE 'name' DISABLE KEYS in
version-conditional comments so that older servers skip them. This breaks
parsing to detect the database name, because Tungsten tries to optimize
regular expression parsing by only parsing lines that begin with a well-known
keyword like CREATE or DROP. This should be merely an annoyance, since the
shard ID is set to #UNKNOWN, which causes a critical section on parallel
apply. This should hurt performance but is otherwise harmless.

Here's an example.

SEQ# = 191 / FRAG# = 0 (last frag)
- TIME = 2011-02-20 16:30:44.0
- EVENTID = 000061:0000000067991301;895613
- SOURCEID = logos1
- STATUS = COMPLETED(2)
- SCHEMA = employees
- METADATA = [mysql_server_id=1;service=percona;shard=#UNKNOWN]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = UTF-8, autocommit = 1, sql_auto_is_null = 1, 
foreign_key_checks = 0, unique_checks = 0, sql_mode = 'NO_AUTO_VALUE_ON_ZERO', 
character_set_client = 33, collation_connection = 33, collation_server = 8]
- SQL(0) = /*!40000 ALTER TABLE `dept_emp` DISABLE KEYS */ /* ___SERVICE___ = 
[percona] */

2.) The MySQL extractor is generating empty events following such commands. 
This may be similar to problems we had previously with REPLACE commands. Here's 
the following event:

SEQ# = 192 / FRAG# = 0 (last frag)
- TIME = 2011-02-20 16:30:44.975
- EVENTID = 000061:0000000067991328;0
- SOURCEID = logos1
- STATUS = COMPLETED(2)
- METADATA = [service=percona;shard=#UNKNOWN]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent

These do not seem problematic here because they are single events.

3.) Finally, we have cases where we are generating empty events following very 
large INSERT events, as shown below.

SEQ# = 193 / FRAG# = 0
- TIME = 2011-02-20 16:30:44.0
- EVENTID = 000061:0000000069013054;895613
- SOURCEID = logos1
- STATUS = COMPLETED(2)
- SCHEMA = employees
- METADATA = [mysql_server_id=1;service=percona;shard=employees]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = UTF-8, autocommit = 1, sql_auto_is_null = 1, foreign_key_checks = 0, unique_checks = 0, sql_mode = 'NO_AUTO_VALUE_ON_ZERO', character_set_client = 33, collation_connection = 33, collation_server = 8]
- SQL(0) = INSERT INTO `dept_emp` VALUES (10001,'d005','1986-06-26','9999-01-01'

. . . lots of data . . .

'd004','1995-03-27','9999-01-01'),(32540,'d004','1985-08-29','9999-01-01'),(32541,'d008','1985-06-04','9999-01-01'),(32542,'d008','1989-05-12','9999-01-01') /* ___SERVICE___ = [percona] */
SEQ# = 193 / FRAG# = 1 (last frag)
- TIME = 2011-02-20 16:30:45.601
- EVENTID = 000061:0000000069013081;0
- SOURCEID = logos1
- STATUS = COMPLETED(2)
- METADATA = [service=percona;shard=#UNKNOWN]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent

This latter case looks like a real problem, because the 2nd event fragment will 
create a critical section. The critical section can run out of order as follows.

1.) The queue for the 'percona' shard will apply the event fragment but will 
not commit because it only has the first fragment.
2.) The critical section will now run. If it affects the uncommitted data or 
conflicts with it there will be a problem.

This case exposes another race condition on parallel apply that I will describe 
in the next comment.

Comment by Robert Hodges [20/Feb/11 08:45 PM]
The second race condition is as follows. I believe this is the source of the
error that Giuseppe noted.

1.) CREATE TABLE runs in the partition for its shard, which is percona in my 
tests.

2.) ALTER TABLE within comment appears. As it is assigned to the #UNKNOWN 
shard, it creates a critical section.

3.) The ParallelQueue blocks until all apply queues are empty but 
_does_not_ensure_ that the statement in step 1 has completed.

4.) The ALTER TABLE is placed in a queue. If there is any latency processing 
the CREATE TABLE command, the ALTER TABLE will run first.

This is a defect of critical section handling. We need to ensure the apply 
threads have committed before starting a critical section. This could be done 
by sending a control event that has a shared semaphore and have each apply 
thread decrement the semaphore after the control event commits. 
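
That proposal can be sketched with a CountDownLatch standing in for the
shared semaphore (types and flow are simplified here; the real control-event
plumbing through the pipeline is more involved): the coordinator blocks until
every apply thread has committed past the control event, and only then runs
the critical section.

import java.util.concurrent.CountDownLatch;

public class CriticalSectionBarrier
{
    public static void main(String[] args) throws InterruptedException
    {
        int applyThreads = 4;
        // One count per apply thread; a thread counts down only after it has
        // committed everything up to the control event.
        final CountDownLatch committed = new CountDownLatch(applyThreads);

        for (int i = 0; i < applyThreads; i++)
        {
            final int id = i;
            new Thread(new Runnable()
            {
                public void run()
                {
                    // ... apply and commit pending work for this queue ...
                    System.out.println("apply-" + id + " committed up to control event");
                    committed.countDown();
                }
            }).start();
        }

        committed.await(); // never start the critical section early
        System.out.println("all queues committed; running critical section");
        // ... execute the #UNKNOWN-shard statement serially here ...
    }
}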


Original issue reported on code.google.com by [email protected] on 14 Apr 2011 at 9:03

  • Blocking: #191, #360

CREATE FUNCTION command causes a bi-directional replication loop

Tungsten needs to insert proper comments into the CREATE FUNCTION command to
indicate the service and thereby avoid replication loops. CREATE VIEW has a
similar problem.  We need a permanent answer to this problem. Here are the
current options:

1.) Only send DDL from a single location, as statements like CREATE VIEW create 
unavoidable loops. We can protect against accidents by refusing to accept 
dangerous DDL or unrecognized statements from one side of the loop.

2.) Create a UDF to set the server ID externally. This appears to be possible.

For now we will implement #1. There will be an option on the
BidiReplicationFilter called allowBidiUnsafe that, when set to false, rejects
any statement that looks likely to cause a loop in bi-directional replication.
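
A rough sketch of option #1 as described (the option name allowBidiUnsafe is
taken from the text above; the matching logic is invented for illustration):

import java.util.regex.Pattern;

public class BidiUnsafeCheck
{
    // CREATE FUNCTION / CREATE VIEW strip or ignore the ___SERVICE___ comment
    // Tungsten uses to break replication loops, so with allowBidiUnsafe=false
    // they are refused outright.
    private static final Pattern UNSAFE_DDL = Pattern.compile(
            "^\\s*CREATE\\s+(FUNCTION|VIEW)\\b", Pattern.CASE_INSENSITIVE);

    static boolean allowBidiUnsafe = false;

    static void filter(String sql)
    {
        if (!allowBidiUnsafe && UNSAFE_DDL.matcher(sql).find())
            throw new IllegalStateException("Refusing bidi-unsafe statement: " + sql);
    }

    public static void main(String[] args)
    {
        filter("INSERT INTO t1 VALUES (1)"); // passes
        try
        {
            filter("CREATE FUNCTION f() RETURNS INT RETURN 1");
        }
        catch (IllegalStateException e)
        {
            System.out.println(e.getMessage()); // rejected
        }
    }
}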


Original issue reported on code.google.com by [email protected] on 14 Apr 2011 at 9:12

When using row replication, session variables are not replicated, which can cause errors

Here is an example of the stack trace that can result. 

INFO | jvm 1 | 2011/02/18 16:58:37 | 2011-02-18 16:58:37,125 ERROR replicator.applier.JdbcApplier PreparedStatement failed: INSERT INTO `mats`.`test2` ( `i` , `j` , `v` ) VALUES ( ? , ? , ? )
INFO | jvm 1 | 2011/02/18 16:58:37 | Arguments:
INFO | jvm 1 | 2011/02/18 16:58:37 | - ROW# = 0
INFO | jvm 1 | 2011/02/18 16:58:37 | - COL(1: i) = 14
INFO | jvm 1 | 2011/02/18 16:58:37 | - COL(2: j) = 12
INFO | jvm 1 | 2011/02/18 16:58:37 | - COL(3: v) = [B@187b287
INFO | jvm 1 | 2011/02/18 16:58:37 | 2011-02-18 16:58:37,125 ERROR replicator.pipeline.SingleThreadStageTask [q-to-dbms] Event application failed: seqno=1 fragno=0 message=com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot add or update a child row: a foreign key constraint fails (`mats`.`test2`, CONSTRAINT `test2_ibfk_1` FOREIGN KEY (`j`) REFERENCES `test` (`i`))
INFO | jvm 1 | 2011/02/18 16:58:37 | com.continuent.tungsten.replicator.applier.ApplierException: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot add or update a child row: a foreign key constraint fails (`mats`.`test2`, CONSTRAINT `test2_ibfk_1` FOREIGN KEY (`j`) REFERENCES `test` (`i`))
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.continuent.tungsten.replicator.applier.JdbcApplier.applyOneRowChangePrepared(JdbcApplier.java:943)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.continuent.tungsten.replicator.applier.JdbcApplier.applyRowChangeData(JdbcApplier.java:1050)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.continuent.tungsten.replicator.applier.JdbcApplier.apply(JdbcApplier.java:1116)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.continuent.tungsten.replicator.applier.ApplierWrapper.apply(ApplierWrapper.java:90)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(SingleThreadStageTask.java:375)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThreadStageTask.java:119)
INFO | jvm 1 | 2011/02/18 16:58:37 | at java.lang.Thread.run(Thread.java:619)
INFO | jvm 1 | 2011/02/18 16:58:37 | Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot add or update a child row: a foreign key constraint fails (`mats`.`test2`, CONSTRAINT `test2_ibfk_1` FOREIGN KEY (`j`) REFERENCES `test` (`i`))
INFO | jvm 1 | 2011/02/18 16:58:37 | at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
INFO | jvm 1 | 2011/02/18 16:58:37 | at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
INFO | jvm 1 | 2011/02/18 16:58:37 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
INFO | jvm 1 | 2011/02/18 16:58:37 | at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.Util.getInstance(Util.java:384)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1041)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3566)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3498)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2568)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2113)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2409)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2327)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2312)
INFO | jvm 1 | 2011/02/18 16:58:37 | at com.continuent.tungsten.replicator.applier.JdbcApplier.applyOneRowChangePrepared(JdbcApplier.java:924)
INFO | jvm 1 | 2011/02/18 16:58:37 | ... 6 more


Original issue reported on code.google.com by [email protected] on 14 Apr 2011 at 9:16

tungsten-installer does not check if a given directory name is a regular file

tungsten-installer uses some default paths to create directories, for example
/opt/continuent/thl.

It does not check whether that name already exists as a regular file.
Therefore, if a regular file named /opt/continuent/thl exists, the
installation fails.

Original issue reported on code.google.com by g.maxia on 1 May 2011 at 3:23

installation fails with tungsten-installer

What steps will reproduce the problem?

TUNGSTEN_BASE=$HOME/testtool

./tools/tungsten-installer \
    --master-slave --master-host=qa.r1.continuent.com \
    --datasource-user=tungsten \
    --datasource-password=secret \
    --service-name=xyz \
    --home-directory=$TUNGSTEN_BASE \
    --thl-directory=$TUNGSTEN_BASE/thl \
    --relay-directory=$TUNGSTEN_BASE/relay \
    --cluster-hosts=qa.r1.continuent.com,qa.r4.continuent.com


What is the expected output? 

What do you see instead?
Processing status command...
NAME                     VALUE
----                     -----
appliedLastEventId     : NONE
appliedLastSeqno       : -1
appliedLatency         : -1.0
clusterName            : default
currentEventId         : NONE
currentTimeMillis      : 1304264286693
dataServerHost         : qa.r4.continuent.com
extensions             : 
host                   : null
latestEpochNumber      : -1
masterConnectUri       : thl://qa_r1_continuent_com:2112/
masterListenUri        : thl://qa.r4.continuent.com:2112/
maximumStoredSeqNo     : -1
minimumStoredSeqNo     : -1
offlineRequests        : NONE
pendingError           : Stage task failed: remote-to-thl
pendingErrorCode       : NONE
pendingErrorEventId    : NONE
pendingErrorSeqno      : -1
pendingExceptionMessage: hostname can't be null
resourcePrecedence     : 99
rmiPort                : -1
role                   : slave
seqnoType              : java.lang.Long
serviceName            : xyz
serviceType            : unknown
simpleServiceName      : xyz
siteName               : default
sourceId               : qa.r4.continuent.com
state                  : OFFLINE:ERROR
timeInStateSeconds     : 147.09
uptimeSeconds          : 147.415
Finished status command...


Excerpt from log follows

INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,598 INFO  replicator.pipeline.SingleThreadStageTask Unexpected error: Stage task failed: remote-to-thl
INFO   | jvm 1    | 2011/05/01 17:35:39 | java.lang.IllegalArgumentException: hostname can't be null
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at java.net.InetSocketAddress.<init>(Unknown Source)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.thl.Connector.connect(Connector.java:111)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.thl.RemoteTHLExtractor.openConnector(RemoteTHLExtractor.java:307)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.thl.RemoteTHLExtractor.extract(RemoteTHLExtractor.java:127)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.thl.RemoteTHLExtractor.extract(RemoteTHLExtractor.java:48)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(SingleThreadStageTask.java:217)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThreadStageTask.java:148)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at java.lang.Thread.run(Unknown Source)
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,598 ERROR replicator.management.OpenReplicatorManager Received error notification, shutting down services: Stage task failed: remote-to-thl
INFO   | jvm 1    | 2011/05/01 17:35:39 | java.lang.IllegalArgumentException: hostname can't be null
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at java.net.InetSocketAddress.<init>(Unknown Source)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.thl.Connector.connect(Connector.java:111)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.thl.RemoteTHLExtractor.openConnector(RemoteTHLExtractor.java:307)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.thl.RemoteTHLExtractor.extract(RemoteTHLExtractor.java:127)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.thl.RemoteTHLExtractor.extract(RemoteTHLExtractor.java:48)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(SingleThreadStageTask.java:217)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThreadStageTask.java:148)
INFO   | jvm 1    | 2011/05/01 17:35:39 |   at java.lang.Thread.run(Unknown Source)
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,599 INFO  replicator.pipeline.SingleThreadStageTask [remote-to-thl-0] Terminating processing for stage
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,599 WARN  replicator.management.OpenReplicatorManager Performing emergency service shutdown
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,599 INFO  replicator.pipeline.SingleThreadStageTask [remote-to-thl-0] Stage event count: 0
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,599 INFO  replicator.pipeline.Pipeline Shutting down pipeline: slave
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,599 WARN  replicator.pipeline.SingleThreadStageTask [thl-to-q-0] Received unexpected interrupt in stage task: thl-to-q
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,599 INFO  replicator.pipeline.SingleThreadStageTask [thl-to-q-0] Terminating processing for stage
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,599 INFO  replicator.pipeline.SingleThreadStageTask [thl-to-q-0] Stage event count: 0
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,599 WARN  replicator.pipeline.SingleThreadStageTask [q-to-dbms-0] Received unexpected interrupt in stage task: q-to-dbms
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.pipeline.SingleThreadStageTask [q-to-dbms-0] Terminating processing for stage
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.pipeline.SingleThreadStageTask [q-to-dbms-0] Stage event count: 0
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.pipeline.Pipeline Releasing pipeline: slave
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.pipeline.StageTaskGroup Releasing task: 0
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.replicator.thl.RemoteTHLExtractor
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.replicator.thl.THLStoreAdapter
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.pipeline.StageTaskGroup Releasing task: 0
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.replicator.thl.THLStoreAdapter
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.enterprise.replicator.store.ParallelQueueApplier
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.pipeline.StageTaskGroup Releasing task: 0
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.enterprise.replicator.store.ParallelQueueExtractor
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.replicator.filter.MySQLSessionSupportFilter
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.enterprise.replicator.filter.BidiRemoteSlaveFilter
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,600 INFO  replicator.applier.ApplierWrapper Releasing raw applier
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,601 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.replicator.applier.ApplierWrapper
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,601 INFO  replicator.thl.Server Stopping server thread
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,601 INFO  replicator.thl.Server Server thread cancelled
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,601 INFO  replicator.thl.Server Closing connector handlers for THL Server: store=thl
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,601 INFO  replicator.thl.Server Closing socket: store=thl host=/0:0:0:0:0:0:0:0 port=2112
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,601 INFO  replicator.thl.Server THL thread done: store=thl
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,602 INFO  replicator.thl.CommitSeqnoTable Reduced 0 task entries: tungsten_xyz.trep_commit_seqno
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,603 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHL
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,603 INFO  replicator.conf.ReplicatorRuntime Plug-in released successfully: class name=com.continuent.tungsten.enterprise.replicator.store.ParallelQueueStore
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,603 INFO  replicator.management.OpenReplicatorManager All internal services are shut down; replicator ready for recovery
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,603 INFO  replicator.management.OpenReplicatorManager Sent State Change Notification GOING-ONLINE:SYNCHRONIZING -> OFFLINE:ERROR
INFO   | jvm 1    | 2011/05/01 17:35:39 | 2011-05-01 17:35:39,603 WARN  replicator.management.OpenReplicatorManager Received irrelevant event for current state: state=OFFLINE:ERROR event=OfflineNotification
INFO   | jvm 1    | 2011/05/01 17:35:47 | 2011-05-01 17:35:47,766 INFO  replicator.conf.PropertiesManager Reading static properties file: /home/tungsten/testtool/releases/tungsten-replicator-2.0.3/cluster-home/bin/../../tungsten-replicator/conf/static-xyz.properties
INFO   | jvm 1    | 2011/05/01 17:38:06 | 2011-05-01 17:38:06,676 INFO  replicator.conf.PropertiesManager Reading static properties file: /home/tungsten/testtool/releases/tungsten-replicator-2.0.3/cluster-home/bin/../../tungsten-replicator/conf/static-xyz.properties

Original issue reported on code.google.com by g.maxia on 1 May 2011 at 3:47

Inserting data of type geometry crashes Tungsten master when using row replication

This issue is migrated from http://forge.continuent.org/jira/browse/TUC-143. 

Description      
The following SQL will make the master crash when row replication is enabled:

CREATE TABLE geom (name VARCHAR(64) CHARACTER SET utf8, bincol BLOB, shape 
GEOMETRY, binutf VARCHAR(64) CHARACTER SET utf8 COLLATE utf8_bin);
INSERT INTO geom (name, shape) VALUES ('test', GeomFromText('Point(132865 501937)'));


This was detected with connector/mysql tests and is probably a regression.

Comment by Gilles Rayrat [11/Aug/10 10:39 AM]
Relevant replicator logs:


INFO | jvm 1 | 2010/08/08 09:37:27 | 2010-08-08 08:37:26,977 ERROR extractor.mysql.RowsLogEvent Failure while processing extracted write row event
INFO | jvm 1 | 2010/08/08 09:37:27 | com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractException: unknown data type 255
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.RowsLogEvent.extractValue(RowsLogEvent.java:812)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.RowsLogEvent.processExtractedEventRow(RowsLogEvent.java:908)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.WriteRowsLogEvent.processExtractedEvent(WriteRowsLogEvent.java:73)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractor.extractEvent(MySQLExtractor.java:792)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractor.extract(MySQLExtractor.java:933)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.ExtractorWrapper.extract(ExtractorWrapper.java:93)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(SingleThreadStageTask.java:189)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThreadStageTask.java:117)
INFO | jvm 1 | 2010/08/08 09:37:27 | at java.lang.Thread.run(Thread.java:595)
INFO | jvm 1 | 2010/08/08 09:37:27 | 2010-08-08 08:37:26,978 ERROR extractor.mysql.MySQLExtractor Failed to extract from mysql-bin.000002 (741)
INFO | jvm 1 | 2010/08/08 09:37:27 | com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractException: unknown data type 255
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.RowsLogEvent.extractValue(RowsLogEvent.java:812)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.RowsLogEvent.processExtractedEventRow(RowsLogEvent.java:908)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.WriteRowsLogEvent.processExtractedEvent(WriteRowsLogEvent.java:73)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractor.extractEvent(MySQLExtractor.java:792)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractor.extract(MySQLExtractor.java:933)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.extractor.ExtractorWrapper.extract(ExtractorWrapper.java:93)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(SingleThreadStageTask.java:189)
INFO | jvm 1 | 2010/08/08 09:37:27 | at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThreadStageTask.java:117)
INFO | jvm 1 | 2010/08/08 09:37:27 | at java.lang.Thread.run(Thread.java:595)

Comment by Stephane Giron [12/Aug/10 10:07 AM]
Probably not a regression: from code inspection, it seems this is not handled
by the 1.2 branch either.

MySQLBinlog defines:
    public static final int MYSQL_TYPE_GEOMETRY = 255;

but this constant is never used or referenced.
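
Given that unused constant, the fix presumably amounts to adding a case for
type 255 in the row-value decoder. A hedged sketch (the method shape is
invented; the real extractValue signature differs): GEOMETRY arrives in the
binlog as a binary payload, so passing it through as raw bytes, BLOB-style,
avoids the "unknown data type 255" failure.

public class GeometryValueSketch
{
    public static final int MYSQL_TYPE_GEOMETRY = 255; // the unused constant above

    // Hypothetical fragment of a row-value decoder: return the raw (WKB)
    // bytes, as for a BLOB, instead of failing on type 255.
    static Object extractValue(int type, byte[] payload)
    {
        switch (type)
        {
            case MYSQL_TYPE_GEOMETRY :
                return payload; // pass the binary payload through untouched
            default :
                throw new IllegalArgumentException("unknown data type " + type);
        }
    }

    public static void main(String[] args)
    {
        byte[] wkb = {0x01, 0x01, 0x00, 0x00, 0x00}; // truncated WKB header, illustrative
        System.out.println(((byte[]) extractValue(MYSQL_TYPE_GEOMETRY, wkb)).length + " bytes");
    }
}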

Comment by Gilles Rayrat [13/Aug/10 05:26 AM]
Agreed - not a regression.
Postponing to 1.3.1.


Original issue reported on code.google.com by [email protected] on 14 Apr 2011 at 9:08

Auto-commit DDL statements break block commit

Auto-commit DDL statements create a number of problems for block commit.  
Imagine you have the following sequence of statements that Tungsten tries to 
execute in block commit: 

<beginning of block>
CREATE TABLE
INSERT 
INSERT 
DROP TABLE
<end of block>

MySQL will auto-commit before and after the CREATE and DROP statements.  
However, Tungsten does not recognize this and only updates trep_commit_seqno 
after the DROP TABLE.  This has two effects: 

1.) If there is a failure in the block, the CREATE and DROP statements do not 
roll back.  This means on restart Tungsten will try to re-submit statements 
that are already applied.  

2.) In multi-master replication we depend on seeing the updates to 
trep_commit_seqno to identify SQL from a remote master.  However, in this case, 
the INSERT statements will commit without a trailing update to 
trep_commit_seqno and will therefore be tagged with the local service instead 
of the remote service that they need. 

The correct behavior in both cases is that Tungsten should recognize an
auto-commit statement and commit it together with an update to
trep_commit_seqno.  There is still a window where the DDL succeeds but the
trep_commit_seqno update fails, but this will be restricted to a single
statement.
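
The behavior described above could look roughly like this (a sketch with
invented helper names, not Tungsten's actual applier): any statement that the
server will auto-commit closes the current block with its own
trep_commit_seqno update, so a restart never re-submits an applied DDL.

import java.util.Arrays;
import java.util.List;

public class BlockCommitSketch
{
    // Invented helper: MySQL implicitly commits around most DDL statements.
    static boolean isAutoCommitDDL(String sql)
    {
        String s = sql.trim().toUpperCase();
        return s.startsWith("CREATE ") || s.startsWith("DROP ") || s.startsWith("ALTER ");
    }

    // Applies one block; every auto-commit DDL immediately records its seqno
    // in trep_commit_seqno, so both problems above are avoided.
    static void applyBlock(List<String> statements)
    {
        int lastCommitted = -1;
        for (int seqno = 0; seqno < statements.size(); seqno++)
        {
            String sql = statements.get(seqno);
            System.out.println("execute: " + sql);
            if (isAutoCommitDDL(sql))
            {
                // The server has already committed; record it right away.
                commitWithSeqnoUpdate(seqno);
                lastCommitted = seqno;
            }
        }
        if (lastCommitted < statements.size() - 1)
            commitWithSeqnoUpdate(statements.size() - 1); // close the block
    }

    static void commitWithSeqnoUpdate(int seqno)
    {
        System.out.println("UPDATE trep_commit_seqno SET seqno=" + seqno + "; COMMIT");
    }

    public static void main(String[] args)
    {
        applyBlock(Arrays.asList("CREATE TABLE t1 (i INT)",
                "INSERT INTO t1 VALUES (1)", "INSERT INTO t1 VALUES (2)",
                "DROP TABLE t1"));
    }
}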

Original issue reported on code.google.com by [email protected] on 27 Apr 2011 at 1:18

Event extraction failed: Relay log task has unexpectedly terminated

Using parallel replication, replication stops unexpectedly, with a message:
"Event extraction failed: Relay log task has unexpectedly terminated"


binlogs in the MASTER
-rw-rw---- 1 mysql mysql 1.1G Apr 23 20:39 mysql-bin.000001
-rw-rw---- 1 mysql mysql 140M Apr 23 20:40 mysql-bin.000002
-rw-rw---- 1 mysql mysql 765M Apr 23 20:49 mysql-bin.000003
-rw-rw---- 1 mysql mysql 954M Apr 23 20:59 mysql-bin.000004
-rw-rw---- 1 mysql mysql 818M Apr 23 21:07 mysql-bin.000005
-rw-rw---- 1 mysql mysql 952M Apr 23 21:17 mysql-bin.000006
-rw-rw---- 1 mysql mysql 1.1G Apr 23 21:28 mysql-bin.000007
-rw-rw---- 1 mysql mysql 1.1G Apr 23 21:40 mysql-bin.000008
-rw-rw---- 1 mysql mysql  51M Apr 23 21:41 mysql-bin.000009
-rw-rw---- 1 mysql mysql  149 Apr 23 21:41 mysql-bin.000010
-rw-rw---- 1 mysql mysql  149 Apr 23 21:56 mysql-bin.000011
-rw-rw---- 1 mysql mysql  106 Apr 23 21:56 mysql-bin.000012
-rw-rw---- 1 mysql mysql  228 Apr 23 21:56 mysql-bin.index


Service in the slave was started with:
./configure -b
./configure-service --create \
    -c tungsten.cfg \
    --role=direct \
    --extract-db-host=qa.r1.continuent.com \
    --extract-db-port=3306 \
    --extract-db-user=yyyyy \
    --extract-db-password=xxxxx \
    --master-host=qa.r1.continuent.com \
    --service-type=remote \
    --extract-method=relay \
    --channels=20 dragon

./tungsten-replicator/bin/trepctl online -from-event mysql-bin.000001:4

binlogs relayed in the SLAVE
$ ls -lh relay-logs/dragon/
total 5.5G
-rw-r--r-- 1 tungsten tungsten 765M Apr 23 22:03 mysql-bin.000003
-rw-r--r-- 1 tungsten tungsten 954M Apr 23 22:05 mysql-bin.000004
-rw-r--r-- 1 tungsten tungsten 818M Apr 23 22:07 mysql-bin.000005
-rw-r--r-- 1 tungsten tungsten 952M Apr 23 22:08 mysql-bin.000006
-rw-r--r-- 1 tungsten tungsten 1.1G Apr 23 22:10 mysql-bin.000007
-rw-r--r-- 1 tungsten tungsten 1.1G Apr 23 22:11 mysql-bin.000008
-rw-r--r-- 1 tungsten tungsten  51M Apr 23 22:11 mysql-bin.000009
-rw-r--r-- 1 tungsten tungsten  149 Apr 23 22:11 mysql-bin.000010
-rw-r--r-- 1 tungsten tungsten  149 Apr 23 22:11 mysql-bin.000011
-rw-r--r-- 1 tungsten tungsten  149 Apr 23 22:16 mysql-bin.000012
-rw-r--r-- 1 tungsten tungsten  106 Apr 23 22:16 mysql-bin.000013
-rw-r--r-- 1 tungsten tungsten  780 Apr 23 22:16 mysql-bin.index


trepctl status:
pendingError           : Event extraction failed: Relay log task has 
unexpectedly terminated; logs may not be accessible
pendingErrorCode       : NONE
pendingErrorEventId    : NONE
pendingErrorSeqno      : -1
pendingExceptionMessage: Relay log task has unexpectedly terminated; logs may 
not be accessible


Original issue reported on code.google.com by g.maxia on 23 Apr 2011 at 8:28

Single command setup for replication

Parallel MySQL slave setup currently requires two commands with a large number 
of options.  This makes initial setup rather cumbersome and difficult to scale 
effectively.  We would like to have Tungsten set up very simply in the case 
where we are using Tungsten to take over replication duties for a slave instead 
of using the normal MySQL native replication. 

Tungsten installation will be updated to include a single command setup that 
uses a single configuration file.  All commonly used options for parallel slave 
setup will be included in this file.  

Original issue reported on code.google.com by [email protected] on 25 Apr 2011 at 3:28

tungsten-replicator duplicates the expanded tarball directory

What steps will reproduce the problem?
1. Expand Tungsten tarball
2. Install tungsten with master-slave topology, using tungsten-installer

What is the expected output?
I expect the installer to use the same directory from which I start the
installation. If the directory is not under tungsten-home, the installation
should fail.

What do you see instead?
After the installation, two copies of the binaries are present: one in the 
directory where the installation started, and one under a subdirectory of the 
directory indicated by tungsten-home.



Original issue reported on code.google.com by g.maxia on 1 May 2011 at 3:43

Replicator does not correctly assign #UNKNOWN shard when updates cross schema boundaries within a single transaction

Tungsten Replicator should mark transactions as unknown if there are updates 
to two or more schemas in the transaction.  However, it appears that such 
transactions get the shard ID of the last schema updated in the transaction 
and hence do not serialize.  Here is how to reproduce the problem: 

#!/bin/bash

mysql -h r1 -e 'create schema db1'
mysql -h r1 -e 'create schema db2'
mysql -h r1 -e 'create table db1.t1(i int not null primary key)'
mysql -h r1 -e 'create table db2.t2(i int not null primary key)'
mysql -h r1 -e 'set autocommit=0;begin;insert into db1.t1 values(1); insert 
into db2.t2 values (2); commit'

The resulting transaction looks like the following using thl list: 

SEQ# = 4 / FRAG# = 0 (last frag)
- TIME = 2011-04-08 17:47:57.0
- EVENTID = 000002:0000000000000786;54385
- SOURCEID = qa.r1.continuent.com
- STATUS = COMPLETED(2)
- SCHEMA = 
- METADATA = [mysql_server_id=10;service=alpha;shard=db2]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 1, 
foreign_key_checks = 1, unique_checks = 1, sql_mode = '', character_set_client 
= 8, collation_connection = 8, collation_server = 8]
- SQL(0) = create table db2.t2(i int not null primary key) /* ___SERVICE___ = 
[alpha] */
SEQ# = 5 / FRAG# = 0 (last frag)
- TIME = 2011-04-08 17:47:57.0
- EVENTID = 000002:0000000000001052;54386
- SOURCEID = qa.r1.continuent.com
- STATUS = COMPLETED(2)
- SCHEMA = 
- METADATA = [mysql_server_id=10;service=alpha;shard=db2]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 1, 
foreign_key_checks = 1, unique_checks = 1, sql_mode = '', character_set_client 
= 8, collation_connection = 8, collation_server = 8]
- SQL(0) = insert into db1.t1 values(1) /* ___SERVICE___ = [alpha] */
- OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 1, 
foreign_key_checks = 1, unique_checks = 1, sql_mode = '', character_set_client 
= 8, collation_connection = 8, collation_server = 8]
- SQL(1) = insert into db2.t2 values (2)
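
For reference, the expected metadata for SEQ# 5 would mark the shard as 
unknown, along the lines of:

- METADATA = [mysql_server_id=10;service=alpha;shard=#UNKNOWN]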

Note that the METADATA for SEQ# 5 shows shard=db2 even though the transaction 
updates both db1.t1 and db2.t2. This appears to be a regression. 

Original issue reported on code.google.com by [email protected] on 8 Apr 2011 at 4:03

tungsten-installer does not require slave-password as mandatory

What steps will reproduce the problem?
1. use tungsten-installer in --direct mode, without the --slave-password option
2. start the replicator
3. Check the logs

What is the expected output? 
I expected the installer to complain and abort the installation

What do you see instead?
The installer tried to install anyway, and failed with an unhandled exception

Original issue reported on code.google.com by g.maxia on 1 May 2011 at 5:25

missing option in configure script to set RMI port

The RMI port is set by default to 10000. The only way to change it is by 
modifying the properties file and restarting the replicator, as sketched 
below.

The missing option is a problem in two cases:
* when another service is using port 10000
* when you want to run more than one Tungsten replicator in the same host 
(sandboxes)
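
As a manual workaround sketch, assuming the key is replicator.rmi_port in 
tungsten-replicator/conf/services.properties (the exact file and key names 
may differ by release):

# Change the RMI port by hand and restart; file and key names are assumptions.
sed -i 's/^replicator.rmi_port=10000$/replicator.rmi_port=10010/' \
    tungsten-replicator/conf/services.properties
./tungsten-replicator/bin/replicator restart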

Original issue reported on code.google.com by g.maxia on 20 Apr 2011 at 1:11

missing options to set the applier in configure-service

When setting up two services on the same host, we need to identify the remote 
master (which is possible now) and the slave to which each service is applied.
When only one remote service is used, the information provided with 
./configure is sufficient. But when two services refer to two unrelated 
slaves, there is only one default "local" database server, while we need 
several.

This defect also prevents the creation of a sandbox that uses a single 
instance of Tungsten instead of requiring several.

Original issue reported on code.google.com by g.maxia on 20 Apr 2011 at 1:17

tungsten-installer does not check that master and slave have different sources

What steps will reproduce the problem?
1. tungsten-installer --direct --master-host=host1 --slave-host=host1


What is the expected output? 
The validator catches the error and aborts

What do you see instead?
The installer tries to install anyway, and fails with an unhandled exception.
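
For illustration, a pre-flight check of the kind the validator could perform 
might look like this sketch (the variable names are hypothetical):

# Abort early when --master-host and --slave-host resolve to the same value.
if [ "$MASTER_HOST" = "$SLAVE_HOST" ]; then
    echo "ERROR >> master and slave hosts must be different" >&2
    exit 1
fi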


Original issue reported on code.google.com by g.maxia on 1 May 2011 at 4:54

Use buffered I/O more effectively when extracting from MySQL binlog

Tungsten does not use buffered I/O effectively when extracting from MySQL 
logs.  This leads to very slow extraction performance on slow disk subsystems 
(e.g., NFS) or in cases where there is I/O contention with other processes.  
This will be rectified as follows. 

1. Tungsten will be upgraded to use buffered I/O when extracting.  Extraction 
should work effectively on an NFS-mounted file system and offer minimal 
contention with locally mounted disk.   

2. The I/O buffer size will be a settable parameter. 
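
For illustration, the settable parameter might appear in the replicator 
properties file as sketched below; the property name is hypothetical, since 
the issue does not specify it:

# Hypothetical extractor buffer size in bytes; the actual key name may differ.
replicator.extractor.dbms.bufferSize=131072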

This work is a prerequisite for reading binlog transactions directly from 
mysqld using the slave protocol without an intervening relay file on disk.  

Original issue reported on code.google.com by [email protected] on 25 Apr 2011 at 2:31

When using disk logs, thl list can show a NullPointerException error

stephane@server1:/opt/continuent/tungsten/tungsten-replicator$ bin/thl list 
Connecting to storage2011-04-06 10:15:27,556 INFO replicator.thl.DiskLog Using 
directory '/opt/continuent/logs/' for replicator logs 
2011-04-06 10:15:27,556 INFO replicator.thl.DiskLog Checksums enabled for log 
records: true 
2011-04-06 10:15:27,560 INFO replicator.thl.DiskLog Acquired write lock; log is 
writable 
2011-04-06 10:15:27,565 INFO replicator.thl.DiskLog Loaded event serializer 
class: 
com.continuent.tungsten.enterprise.replicator.thl.serializer.ProtobufSerializer 
2011-04-06 10:15:27,567 INFO replicator.thl.LogIndex Building file index on log 
directory: /opt/continuent/logs 
2011-04-06 10:15:27,569 INFO replicator.thl.LogIndex Constructed index; total 
log files added=1 
2011-04-06 10:15:27,569 INFO replicator.thl.DiskLog Validating last log file: 
/opt/continuent/logs/thl.data.0000000001 
2011-04-06 10:15:27,571 INFO replicator.thl.DiskLog Idle log connection 
timeout: 28800000ms 
2011-04-06 10:15:27,571 INFO replicator.thl.DiskLog Log preparation is complete 
2011-04-06 10:15:27,571 INFO replicator.thl.DiskTHLStorage Adapter preparation 
is complete 
Fatal error: null 
java.lang.NullPointerException 
at 
com.continuent.tungsten.replicator.thl.THLManagerCtrl.printHeader(THLManagerCtrl
.java:307) 
at 
com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHLManagerCtrl.listE
vents(EnterpriseTHLManagerCtrl.java:207) 
at 
com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHLManagerCtrl.main(
EnterpriseTHLManagerCtrl.java:382) 


When using thl list -file <filename>, this error does not show up: 

stephane@server1:/opt/continuent/tungsten/tungsten-replicator$ bin/thl list 
-file thl.data.0000000001 
Connecting to storage2011-04-06 10:05:16,726 INFO replicator.thl.DiskLog Using 
directory '/opt/continuent/logs/' for replicator logs 
2011-04-06 10:05:16,727 INFO replicator.thl.DiskLog Checksums enabled for log 
records: true 
2011-04-06 10:05:16,731 INFO replicator.thl.DiskLog Acquired write lock; log is 
writable 
2011-04-06 10:05:16,735 INFO replicator.thl.DiskLog Loaded event serializer 
class: 
com.continuent.tungsten.enterprise.replicator.thl.serializer.ProtobufSerializer 
2011-04-06 10:05:16,738 INFO replicator.thl.LogIndex Building file index on log 
directory: /opt/continuent/logs 
2011-04-06 10:05:16,741 INFO replicator.thl.LogIndex Constructed index; total 
log files added=1 
2011-04-06 10:05:16,741 INFO replicator.thl.DiskLog Validating last log file: 
/opt/continuent/logs/thl.data.0000000001 
2011-04-06 10:05:16,743 INFO replicator.thl.DiskLog Idle log connection 
timeout: 28800000ms 
2011-04-06 10:05:16,743 INFO replicator.thl.DiskLog Log preparation is complete 
2011-04-06 10:05:16,743 INFO replicator.thl.DiskTHLStorage Adapter preparation 
is complete 
SEQ# = 8 / FRAG# = 0 (last frag) 
- TIME = 2011-04-02 04:25:54.0 
- EVENTID = 001239:0000000000002734;460 
- SOURCEID = atldb01 
- STATUS = COMPLETED(2) 
- SCHEMA = tungsten 
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent 
- OPTIONS = [autocommit = 1, sql_auto_is_null = 1, foreign_key_checks = 1, 
unique_checks = 1, sql_mode = '', character_set_client = 33, 
collation_connection = 33, collation_server = 33] 
- SQL(0) = UPDATE tungsten.heartbeat SET source_tstamp= '2011-04-02 02:25:54', 
salt= 0, name= 'MASTER_ONLINE' WHERE id= 1 

This shows that the first event in the log file is event #8. 

This does not indicate any log corruption; the problem is related to the thl 
tool only. 

Original issue reported on code.google.com by [email protected] on 12 Apr 2011 at 8:37

THL connection URL is not set correctly when using off-board replication

When the configure-service script creates the replicator .properties file, it 
incorrectly substitutes the replicator host name into the 
replicator.store.thl.url parameter, whereas it is supposed to place the 
database host name/IP there. 

Here’s an example.  Suppose the local host is replhost1 and the dbms host is 
dbhost1.  Then we see in the configuration file: 

replicator.store.thl.url=jdbc:mysql:thin://replhost1:4307/tungsten_${service.nam
e}?createDB=true

Notice that the port number is correct, but replhost1 is the name of the 
Tungsten Replicator host and not the database host.
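
The corrected value would substitute the database host instead, e.g.:

replicator.store.thl.url=jdbc:mysql:thin://dbhost1:4307/tungsten_${service.name}?createDB=true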

Original issue reported on code.google.com by [email protected] on 27 Apr 2011 at 12:36

tungsten-installer creates directories without asking and without options to modify the defaults

when you use tungsten-installer, the following directories are created:
backups  configs  releases  service-logs  share

There are no options to define where these directories should go; they are 
created directly under tungsten-home. By contrast, --*-thl-directory and 
--*-relay-directory each have their own option.

I recommend the following:

* if --*-thl-directory and --*-relay-directory are not explicitly set, they 
should be created under tungsten-home. Currently, the installer tries to 
create them under the default path, which may not be available. For example, 
do the following:
  1. make sure that /opt/continuent does not exist on either the master or the slave
  2. start ./tungsten-installer with a custom tungsten-home
  3. see the errors

* if --*-thl-directory and --*-relay-directory are explicitly set, the 
validator should check whether they can be created. In the above example, 
they could not, because the parent directory does not exist.

(This issue should really be three issues, but since they are all related, I 
guess it's OK to fix all of them at once)



Original issue reported on code.google.com by g.maxia on 1 May 2011 at 4:14

tungsten-installer complains of missing options when invoked with --help


$ ./tools/tungsten-installer --help
ERROR >> You must specify either --direct or --master-slave
ERROR >> There are issues with the command options


When using --help, the script should not complain about missing options

Original issue reported on code.google.com by g.maxia on 1 May 2011 at 3:13

The trep_commit_seqno table has an invalid type for applied_latency

The trep_commit_seqno table uses int(11) for applied_latency. This should 
be a float value; as a result, fractional seconds in the commit latency are 
truncated.

Current MySQL table definition shown below:

mysql> describe trep_commit_seqno;
+-----------------+--------------+------+-----+---------+-------+
| Field           | Type         | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+-------+
| task_id         | int(11)      | NO   | PRI | 0       |       |
| seqno           | bigint(20)   | YES  |     | NULL    |       |
| fragno          | smallint(6)  | YES  |     | NULL    |       |
| last_frag       | char(1)      | YES  |     | NULL    |       |
| source_id       | varchar(128) | YES  |     | NULL    |       |
| epoch_number    | bigint(20)   | YES  |     | NULL    |       |
| eventid         | varchar(128) | YES  |     | NULL    |       |
| applied_latency | int(11)      | YES  |     | NULL    |       |
+-----------------+--------------+------+-----+---------+-------+
8 rows in set (0.00 sec) 
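
A minimal manual fix on an existing installation might be the statement 
below, assuming the catalog tables live in the tungsten schema (per-service 
schemas such as tungsten_<service> would need the same change):

mysql> ALTER TABLE tungsten.trep_commit_seqno MODIFY applied_latency FLOAT;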


Original issue reported on code.google.com by [email protected] on 14 Apr 2011 at 9:23

Replicator start in sandbox fails due to non-standard ports

I downloaded tungsten-replicator-2.0.1, unpacked the tarball, and ran 
./configure. The tungsten.cfg it produced is attached. 

I then ran this command to create a service (I'm following the Installation 
Guide):

./configure-service -C --role=master --service-type=local ditto

I then get the following output when trying to start the service:

tlittle@coolaid ~/tungsten-replicator-2.0.1 $ ./tungsten-replicator/bin/trepctl 
-service ditto start
Service started successfully: name=ditto

tlittle@coolaid ~/tungsten-replicator-2.0.1 $ ./tungsten-replicator/bin/trepctl 
-service ditto status
Processing status command...
NAME                     VALUE
----                     -----
appliedLastEventId     : NONE
appliedLastSeqno       : -1
appliedLatency         : -1.0
clusterName            : 
currentEventId         : NONE
currentTimeMillis      : 1303314542966
dataServerHost         : coolaid
extensions             : 
host                   : null
latestEpochNumber      : -1
masterConnectUri       : thl://localhost:2112/
masterListenUri        : thl://coolaid:2112/
maximumStoredSeqNo     : -1
minimumStoredSeqNo     : -1
offlineRequests        : NONE
pendingError           : Replicator unable to go online due to error
pendingErrorCode       : NONE
pendingErrorEventId    : NONE
pendingErrorSeqno      : -1
pendingExceptionMessage: Unable to prepare plugin: class 
name=com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHL
resourcePrecedence     : 99
rmiPort                : -1
role                   : master
seqnoType              : java.lang.Long
serviceName            : ditto
serviceType            : unknown
simpleServiceName      : ditto
siteName               : default
sourceId               : coolaid
state                  : OFFLINE:ERROR
timeInStateSeconds     : 6.711
uptimeSeconds          : 6.883
Finished status command...

Notice this line in particular:

 pendingExceptionMessage:  Unable to prepare plugin: class name=com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHL

I'm surprised to see anything about "EnterpriseTHL" since this is the 
open-source version, but maybe that's just a naming thing.

I may be doing something wrong here, but I was just trying to setup a simple 
master-slave replication using the "Installation & Configuration Guide" found 
here: http://www.continuent.com/downloads/documentation

Thanks,
Trevor Little

Original issue reported on code.google.com by [email protected] on 20 Apr 2011 at 3:56

backup properties list hostname instead of port

In the static-servicename.properties file, the port for agent mysqldump is 
incorrectly defined.

replicator.backup.agent.mysqldump.port=${replicator.global.db.host}

It should be 'port' instead of 'host'.
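
Presumably the intended definition is:

replicator.backup.agent.mysqldump.port=${replicator.global.db.port}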

Original issue reported on code.google.com by g.maxia on 21 Apr 2011 at 11:43

Configure script does not set up disk log

The configure script omits the question of the location of the disk log.  This 
seems to be an omission due to recent merges.  As a result users default to the 
JDBC log unless they fix the tungsten.cfg file manually. 

Original issue reported on code.google.com by [email protected] on 7 Apr 2011 at 3:00

thl unable to read disk log

Following a failed switch, it became clear that thl could not read the log. 
The log files themselves seemed OK, but trying to read them with thl resulted 
in the following error. The replicator was running at the time. 

This might have happened because the replicator was up and running and there is 
disk log logic that wants to write to the log. In general we need to make the 
thl operate in 'read-only' mode for most operations because it can also mess up 
the log if the replicator happens to be down at the time the thl runs. 

Connecting to storage2011-04-05 23:07:47,325 INFO replicator.thl.DiskLog Using 
directory '/opt/continuent/logs/' for replicator logs 
2011-04-05 23:07:47,325 INFO replicator.thl.DiskLog Checksums enabled for log 
records: true 
2011-04-05 23:07:47,329 INFO replicator.thl.DiskLog Unable to acquire write 
lock; log is read-only 
2011-04-05 23:07:47,334 INFO replicator.thl.DiskLog Loaded event serializer 
class: 
com.continuent.tungsten.enterprise.replicator.thl.serializer.ProtobufSerializer 
2011-04-05 23:07:47,383 INFO replicator.thl.LogIndex Building file index on log 
directory: /opt/continuent/logs 
2011-04-05 23:08:06,836 INFO replicator.thl.LogIndex Constructed index; total 
log files added=2245 
2011-04-05 23:08:06,836 INFO replicator.thl.DiskLog Validating last log file: 
/opt/continuent/logs/thl.data.0000002245 
2011-04-05 23:08:07,699 INFO replicator.thl.DiskLog Last log file ends on 
rotate log event: thl.data.0000002245 
2011-04-05 23:08:08,158 WARN replicator.thl.LogFile Unexpected I/O exception 
while closing log file: name=thl.data.0000002245 exception=sync failed 
com.continuent.tungsten.replicator.thl.THLException: New log file exists 
already: thl.data.0000002245 
at 
com.continuent.tungsten.enterprise.replicator.thl.DiskLog.startNewLogFile(DiskLo
g.java:940) 
at 
com.continuent.tungsten.enterprise.replicator.thl.DiskLog.prepare(DiskLog.java:3
10) 
at 
com.continuent.tungsten.enterprise.replicator.thl.DiskTHLStorage.prepare(DiskTHL
Storage.java:113) 
at 
com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHLManagerCtrl.conne
ct(EnterpriseTHLManagerCtrl.java:104) 
at 
com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHLManagerCtrl.main(
EnterpriseTHLManagerCtrl.java:356) 
min seq# = 0 
max seq# = 9223372036854775807 
events = -9223372036854775808 
highest known replicated seq# = -1 
Fatal error: null 
java.lang.NullPointerException 
at 
com.continuent.tungsten.enterprise.replicator.thl.DiskLog.release(DiskLog.java:3
99) 
at 
com.continuent.tungsten.enterprise.replicator.thl.DiskTHLStorage.release(DiskTHL
Storage.java:132) 
at 
com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHLManagerCtrl.disco
nnect(EnterpriseTHLManagerCtrl.java:135) 
at 
com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHLManagerCtrl.main(
EnterpriseTHLManagerCtrl.java:365) 
QA.M3 tungsten@qa[log]$ thl info 
Connecting to storage2011-04-05 23:45:06,443 INFO replicator.thl.DiskLog Using 
directory '/opt/continuent/logs/' for replicator logs 
2011-04-05 23:45:06,443 INFO replicator.thl.DiskLog Checksums enabled for log 
records: true 
2011-04-05 23:45:06,775 INFO replicator.thl.DiskLog Unable to acquire write 
lock; log is read-only 
2011-04-05 23:45:06,899 INFO replicator.thl.DiskLog Loaded event serializer 
class: 
com.continuent.tungsten.enterprise.replicator.thl.serializer.ProtobufSerializer 
2011-04-05 23:45:06,957 INFO replicator.thl.LogIndex Building file index on log 
directory: /opt/continuent/logs 
2011-04-05 23:45:16,936 INFO replicator.thl.LogIndex Constructed index; total 
log files added=2268 
2011-04-05 23:45:16,936 INFO replicator.thl.DiskLog Validating last log file: 
/opt/continuent/logs/thl.data.0000002268 
2011-04-05 23:45:17,991 INFO replicator.thl.DiskLog Idle log connection 
timeout: 28800000ms 
2011-04-05 23:45:17,991 INFO replicator.thl.DiskLog Log preparation is complete 
2011-04-05 23:45:17,991 INFO replicator.thl.DiskTHLStorage Adapter preparation 
is complete 
min seq# = 0 
max seq# = 147543383 
events = 147543384 

Original issue reported on code.google.com by [email protected] on 18 Apr 2011 at 2:00
