Use the spark command line tool. In current versions of AMPS, spark is included in the AMPS distribution.
The instructions in this FAQ are useful for a one-time transfer, for example, loading sample data into a QA environment for testing. For production use, AMPS replication is the recommended approach for low-latency delivery of messages from instance to instance as described in the AMPS User's Guide.
These instructions assume that your data does not contain embedded newlines. If it does, we recommend either writing a simple program that runs a SOW query on one instance and publishes the results to the other instance, or using the -file option available in more recent versions of spark.
Copy from a SOW File
To recreate a SOW from a server that is no longer running, use amps_sow_dump to extract the records from the SOW and use spark to publish those records into the destination AMPS instance.
$ amps_sow_dump --zip-file=records.zip /path/to/sow/file.sow
$ spark publish -server new_server -topic new_topic -type message_type -file records.zip
Copying from a SOW topic in a running instance
To recreate a SOW on a server that is currently running, use the spark sow command to retrieve records from the source topic, and use the spark publish command to publish records to the destination topic.
$ spark sow -server old_server -topic old_topic -type message_type \
| spark publish -server new_server -topic new_topic -type message_type
Notice that, because this is an operation on a running instance of AMPS, any updates that are made to the SOW topic after the sow command runs will not be published to the new topic.
If the messages are binary or contain embedded delimiter characters, and the full set of records fits comfortably on the system where the command is being run, you can use the -file flag to store messages to a compressed file, and then republish the messages from that file.
$ spark sow -server old_server -topic old_topic -type message_type -file messages.zip
$ spark publish -server new_server -topic new_topic -type message_type -file messages.zip
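When the two-step transfer is scripted, it helps to stop if the query step fails, so that a missing or partial file is never published. The following is a minimal sketch, not part of AMPS: the function name copy_topic_via_file and the SPARK override variable are hypothetical, only the spark flags shown above are used, and the sketch assumes that spark exits with a non-zero status on failure.

```shell
#!/bin/sh
# Sketch: copy a SOW topic between running instances through an intermediate file.
# Assumption: spark returns a non-zero exit status when a command fails.
# SPARK is a hypothetical override for the spark executable (defaults to "spark" on PATH).
copy_topic_via_file() {
    src_server=$1
    src_topic=$2
    dst_server=$3
    dst_topic=$4
    msg_type=$5
    file=$6

    # Step 1: query the source SOW into the file. Stop here if the query fails.
    "${SPARK:-spark}" sow -server "$src_server" -topic "$src_topic" \
        -type "$msg_type" -file "$file" || {
        echo "sow query failed; nothing was published" >&2
        return 1
    }

    # Step 2: publish the saved records to the destination topic.
    "${SPARK:-spark}" publish -server "$dst_server" -topic "$dst_topic" \
        -type "$msg_type" -file "$file"
}
```

With the function defined, the transfer above becomes a single call: copy_topic_via_file old_server old_topic new_server new_topic message_type messages.zip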
Copying the SOW File Directly
If both instances of AMPS have compatible SOW file versions (see the file version appendix in the AMPS User Guide), and both instances are currently shut down, you can copy the SOW file from one instance of AMPS to the other. This is most useful in cases where a server is migrating (so all of the data is being moved) or where the current state of the SOW file is being preserved for troubleshooting or diagnostic purposes.
The limitation of this approach is that the messages in the SOW file will not be in the transaction log of the destination server. A replay from the transaction log may not include the messages in the copied file (but may contain messages that are not in the file). Likewise, a historical SOW and subscribe could produce unexpected results. Last, but not least, if the file is damaged or removed, the destination server cannot rebuild the file from the transaction log.
It's important to be sure that both AMPS instances are shut down when the file is copied. Copying a SOW file while AMPS is running may produce a copy that is incomplete or corrupted. Updating a SOW file while AMPS is running (for example, by overwriting the file with a copy from another server) may produce unexpected results from SOW queries, corrupt the file, or cause AMPS to exit.
For some applications, these limitations are acceptable. For other applications, the fact that the transaction log and the topic in the SOW do not contain the same information makes this approach unworkable.
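If you do copy the file directly, a small guard can reduce the risk of copying under a running server. The sketch below is illustrative only: copy_sow_file and AMPS_PROCESS are hypothetical names, the process name ampServer is an assumption about your installation, and the check only covers the host where the script runs, so an instance on another host must be verified separately.

```shell
#!/bin/sh
# Sketch: copy a SOW file only when no AMPS server process is seen on this host,
# then confirm that the copy is byte-for-byte identical to the source.
# Assumption: the server process is named "ampServer"; set AMPS_PROCESS to override.
copy_sow_file() {
    src=$1
    dst=$2

    # Refuse to copy if an AMPS process appears to be running locally.
    # The check is skipped when pgrep is not available.
    if command -v pgrep >/dev/null 2>&1 &&
        pgrep -x "${AMPS_PROCESS:-ampServer}" >/dev/null; then
        echo "AMPS appears to be running; shut it down before copying" >&2
        return 1
    fi

    # Copy the file, preserving mode and timestamps.
    cp -p "$src" "$dst" || return 1

    # Succeed only if source and copy have identical contents.
    cmp -s "$src" "$dst"
}
```

A call such as copy_sow_file /path/to/sow/file.sow /path/to/new/file.sow returns a non-zero status if the server check, the copy, or the comparison fails.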
Keywords: sow synchronization, move sow, copy sow, synchronize sow, move messages
Dirk, what about bringing down Server A and manually copying the files to Server B? Of course, the configs would have to be consistent, but it would work, right?
Does amps_sow_dump need to be run by a user with the right entitlement, or will running it with the application ID do?
Moving files should work.
There are some reasons why I consider moving a file to be a little less desirable than publishing messages in the general case (for the specific case you have in mind, it may be the right thing to do):
1) You need to be using the same major/minor version of AMPS as the AMPS instance that created the source file. (Or use amps_upgrade as necessary to prepare the file for the new instance.)
2) It requires taking both the source and destination instance down -- copying a file out from under a running AMPS instance isn't guaranteed to work (for example, if there's a record partially-written at the exact point in time you do the copy).
3) If the SOW is removed again, the SOW will be out of sync again. Copying the SOW file only moves the SOW, and doesn't synchronize the SOW with the transaction log. Republishing the state directly gets to a known state that will synchronize and recover to that state at that point.
4) There's a little more possibility for error involved in moving files around, having a file system you can mount on both instances, and getting them to the right spot for the new instance, etc.
amps sow dump (omitting underscores to work around comment autoformatting) operates on the SOW file, so the AMPS entitlement system isn't involved.
The user that runs amps sow dump needs to have read permission to the SOW file.
The user ID that runs "spark publish" would need the right entitlement?
If so, is there a Windows version of spark? Can the user use the client API to publish the data from amps_sow_dump?
Yes, the user that runs spark publish must have permission to publish to the destination instance.
The spark included with AMPS 4.0 and above is a Java application, and should run fine on Windows.
If your installation uses authentication that requires an Authenticator (for example, it uses challenge-response), you will also need to provide spark with an Authenticator that implements that authentication (basically, the same thing an AMPS Java client application needs to have to authenticate to AMPS).
And, yes, you could also use one of the AMPS client APIs to publish the data from the dump to AMPS, and that will also work just fine.