Author: Josip Zohil, Koper, Slovenia, April 2013
Visual FoxPro (VFP) has a built-in synchronization mechanism which refreshes data on distributed nodes/clients/servers. In this article the author presents VFP processes added to a VFP application to create a local server with replicated data. This server is at the same time a client and a server. You can use this local server to improve VFP performance and to add messaging and peer-to-peer capabilities to your application.
»Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency« (SQL Server Replication). Replication servers are usually deployed to improve performance by adding parallel processing power.
When you run a VFP application on a Local Area Network (LAN) as a multiuser (client/server) system, you can look at it as a distributed system with the following characteristics:
- Computers are connected;
- The computational workload of the system is distributed between the server and the clients (nodes). Data is manipulated on the nodes (distributed computation). When you run a VFP application with smaller tables, the performance is acceptable, but the network transport and refresh of larger tables will suffer;
- VFP synchronization service can refresh multiple data copies (buffered data) on distributed nodes (clients/workstations).
Example. Run this program on two clients/workstations of VFP application:
Set refresh to 1,-1
Use employee
Browse
Change a field’s value in the Browse window on the first computer and then skip to another record. On the second workstation you see the changed value. Behind the scenes VFP synchronizes the local data on both workstations.
- There is no inter-process communication between the two client processes (except the synchronization (refresh), possible table locking and some other functionality). Also, on multi-processor/multi-core computers each client process runs as a single process (no public or similar variables pass values between processes). The VFP applications (processes) share disks and database data, but no other memory.
- There is only one server (file server).
Note. A network node is a server, a workstation/client, a web service or some other connected service, such as a socket connection or a COM object.
On a LAN you run multiuser VFP applications using the client/server model. You can accept connections from other systems and connect a VFP client to other systems. The model is the same: single-process execution, without parallel exploitation of free network resources. Also, in case your application connects at the same time to two different VFP databases, you communicate with them serially.
There are various ways of parallelizing resources in order to improve system performance. In this article I shall extend the VFP client-server system by adding replica capabilities at the client side. A primary motivation for replicating data in a distributed-computation solution is to cut the network latency of data access by keeping data close to the application that uses it, thereby improving the response time of this application. On a workstation I am going to create a local database. It is a client that pulls the data from the VFP server (master). At the same time it is a server: it pulls the data from the master server and serves this data to other clients (Napster style: a node (peer, client) offers its files to other peers).
Let us consider a problem from the production environment. We have a large data table »dnevnal« (150 MB, 500,000 records) with customer documents. It is not normalized. Each record holds information about the document (number-syd, date), the customer (code, name-naziv and address) and the products sold (code, quantity, price). It also has the field »syd« (the document’s number) and the field id, an auto-increment number. With the query
»select distinct syd, naziv from dnevnal into cursor xx order by syd« (A)
we get a list of all document numbers and customer names. We bind the query results to the combo box in which the user selects a document. When we run this query on a client computer, it executes in two minutes or more.
VFP has to transport a large amount of data over the network from the server to the client. The table “dnevnal” is large (500,000 records). When we run the query (A) on the server computer it executes in one second, inside a virtual machine on the server computer in 30 seconds, and on a wireless-connected workstation in 180 seconds. Can we approach the server’s query speed on the workstation? Can we do this using free computer resources and programming techniques?
Our queries execute fast on the server computer, but the slow network suffers, so the network data transport is the bottleneck: the query has to transport 150 MB in order to aggregate them into a small cursor (0.5 MB). The network (transport) is our most loaded resource; it is the bottleneck, so we shall distribute some of its load to the local computer.
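A back-of-envelope check of this claim, sketched in Python rather than VFP. The 150 MB table size and the 0.5 MB result size come from the measurements above; the throughput figures are assumptions for illustration only:

```python
# Estimate: time to move the query's input over the wire versus reading it
# from a local disk. The throughput numbers below are assumed, not measured.

def transfer_seconds(size_mb, throughput_mb_s):
    """Seconds needed to move size_mb at the given throughput."""
    return size_mb / throughput_mb_s

table_mb = 150           # size of dnevnal (from the article)
wireless_mb_s = 1.0      # assumed slow wireless LAN throughput
local_disk_mb_s = 150.0  # assumed local disk read throughput

remote = transfer_seconds(table_mb, wireless_mb_s)     # query over the wire
local = transfer_seconds(table_mb, local_disk_mb_s)    # query on a local replica
print(remote)  # 150.0 -> the same order as the observed 180 s
print(local)   # 1.0   -> the same order as the observed server-side 1 s
```

The ratio between the two numbers, not their absolute values, is the point: the transport dominates, so moving the data close to the query pays off.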
You can improve the query speed in various ways:
1) add better hardware;
2) normalize the table “dnevnal” and navigate with smaller normalized data tables. You can look at table normalization as a kind of distribution, with a very limited number of distribution parts (mostly two or three). It is difficult to partition the normalized tables; they have limited possibilities to scale. Using the normalization technique you can reduce the problem, but you do not solve it: the normalized tables can also grow to 150 MB and more;
3) use the VFP synchronization (refresh) capabilities, and thus add to VFP more distribution and parallel processes. You can distribute the data on the network, execute the queries in parallel and synchronize the distributed parts.
You can add a parallel VFP server to the workstation: a local database and a local service (process/agent) that:
- Watches the server (master) database events (insert and update triggers);
- Maintains the local data (tables and indexes) and synchronizes it with the server's. It refreshes the local data periodically and/or on demand. The downloaded refreshed data are chunks of the table “dnevnal”;
- The clients then read the data from this local server (database).
The process creates a local database and synchronizes (refreshes) its data with the master server’s database. We shall use two refresh strategies:
1) Updating local data on master server’s changes (asynchronous process);
2) Updating them on demand, when the client needs them (synchronous process).
Refreshing local data means fetching chunks of the master's data and updating/inserting them into the local (replica) database. We suppose that most data changes are at the end of dnevnal (the last records). We have to fetch only a set of records at the table's end. There are also the inserted records. The table has an auto-increment (unique) field. For example, if we make changes in the record with the auto-increment (AI) number 200, we expect that there are probably changes in the records with AI greater than 200, so we fetch data from records with AI greater than or equal to 200. There are no deleted records in the table.
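The refresh strategy above can be sketched as follows, in Python rather than VFP; the in-memory dictionaries stand in for the master and local tables, keyed by the auto-increment id:

```python
# Minimal sketch of watermark-based refresh: re-fetch everything from the
# smallest changed id (lastid) onward, overwriting stale local rows and
# appending new ones. Table contents here are illustrative.

def refresh(master, local, lastid):
    """Pull rows with id >= lastid from master into the local replica."""
    for row_id, row in master.items():
        if row_id >= lastid:
            local[row_id] = dict(row)   # update existing or insert new row
    return local

master = {1: {"syd": 10}, 2: {"syd": 20}, 3: {"syd": 30}}
local = {1: {"syd": 10}, 2: {"syd": 99}}   # row 2 is stale, row 3 is missing
refresh(master, local, lastid=2)
print(local[2]["syd"], 3 in local)  # 20 True
```

Because the table has no deletes, overwrite-or-append is all the protocol needs; a delete would require extra bookkeeping.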
The resource table is the infrastructure of the messages between the master server, the local servers and the client applications. There are three users of this table: the master server, the local services and a VFP application. In the application’s database add a table “resource” with these fields: mval L, cluster L, hb L, stop L, node C(80), filePath C(80), lastid I. You will store system resources data (their states) in it.
Each node (workstation, virtual machine) has a record in this table. Its fields (columns) are:
- node: the workstation’s name (SYS(0)), used as its key;
- mval: a client’s request to refresh the local data;
- cluster: set when the master table dnevnal has changed;
- hb: the local service’s heartbeat;
- lastid: the smallest auto-increment number of the set of changed records;
- filePath: the path to the local database;
- stop: a request to stop the local service.
We add an insert and update trigger code as two stored procedures to the database table dnevnal:
* Insert Trigger:
FUNCTION inserttrig () && dnevnal.id is an auto-increment number
UPDATE resource SET ;
cluster = .t. , hb=.f., lastid= IIF(lastid>dnevnal.id, dnevnal.id, lastid)
RETURN .T.
* Update Trigger:
FUNCTION updatetrig ()
UPDATE resource SET cluster = .t. , hb=.f.,;
lastid= IIF(lastid>dnevnal.id, dnevnal.id, lastid)
RETURN .T.
On each SQL insert/update of the table dnevnal, a trigger event fires (the inserttrig/updatetrig function) and updates the table »resource« (the column cluster is set to true and hb to false). VFP propagates the values in »cluster« and »hb« to all nodes. The workstation’s local service catches the message. I shall describe this mechanism in the next section.
VFP has a built-in synchronization (refresh) service so it is simple to add new distribution capabilities: a workstation (local) replica server.
Let us describe this program:
*cluster.prg
*config parameters
lc1="c:\temp\local" && path to the local dbc
?lc1
lctable="d:\moj\xxx\data\dnevnal" && or network path
lcname="dnevnal" && table name
idmax=999999999 && used to control changes on the master server (a kind of global variable)
Set Procedure To mesproc
Set Procedure To clusterproc Additive && dnevnal
Init() && initialize app
openres(lc1) && open table resource (message table)
?lctable
createloc(lctable,lc1) &&create local dbc, otherwise open it
Select Local && local table
Go Bott
m.lastLocId=Id && increment number (state parameter)
Do While .T. && loop forever, simulate a tail recursive function
*SELECT resource
checknew() &&refresh local data (catch a new message)
Select Resource
* read a message
DO case
CASE resource.stop && received a message stop
CLOSE DATABASES all
quit
case Resource.mval Or Resource.cluster && in either case refresh local data
*start refresh (update and insert) process
Select Resource
m.lastid=Resource.lastid
initupdate(idmax) && from clusterproc
t1=Seconds()
IF used(lcname)
Select (lcname) && SELECT dnevnal , index on id
ELSE
SELECT 0
USE (lctable)
endif
Go Bott
m.incnostart=Id
?"Last inc.no. in dnevnal",m.incnostart,"Last local inc.no.:",m.lastid
t1=Seconds()
*download the "old" records to update the local data
* Improve the protocol: if no changes, throw away the client's update
updateloc(lctable,m.lastid,m.incnostart) && local and remote inc.no.
? "Elapsed time update:",Seconds()-t1
t1=Seconds()
*pull the "new" records to append to local data
insertloc(m.lastLocId) && from clusterproc
? "Elapsed time insert:",Seconds()-t1
sendMesDone() && update state and send (publish) a message local update is done
* Select Local
&& end of refresh (update, insert) process
endcase
Wait Window "Press a key or S to stop." To lres Timeout 0.2
If Upper(lres)=="S"
*Return
sendMesStop() && send a message to stop
Endif
* update state (heartbeat) (publish a message: I am alive),
sendMesHb() && sent after sendMesDone
Enddo &&loop
Return
This program creates a local replica database (if it does not exist); otherwise it refreshes its data with the master's. It inserts into the resource table a new record with the node name/key (sys(0)). This program is also an event manager built on top of the resource table. It has to be executed in the background (not visible) (or visible in a test environment), with the refresh parameter set to 1. In the resource table it finds the workstation's record (field node) and reads the node's state from it. If the record's value "mval" or "cluster" is true, it downloads a chunk of data from the server and updates the local server. On a server (master) event a value (cluster=.T.) is written into the resource table; the local service listens for this value (message) and then pulls the data from the master.
The cluster process pulls the data in two steps. It starts, fetches from the master server the updated records and after that the inserted ones. The update is executed on the rows that already exist in the local table (updateloc). The insert is executed on the rows that don't exist in the local table (insertloc). At the end of the program it writes a new state (mval=.F., cluster=.F.) in the node's resource record, waits briefly in the background and then loops forever.
During the time window from the start to the end of the update process, there can be changes on the master server. In such a case lastid is less than idmax (resource.lastid<idmax) and we leave lastid unchanged (the process will refresh this data in the next cycle).
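The role of the idmax sentinel can be sketched like this, in Python rather than VFP, following the prose description above; the class and method names are illustrative:

```python
IDMAX = 999_999_999  # sentinel meaning "no changes seen during this refresh"

class ResourceRecord:
    def __init__(self):
        self.lastid = IDMAX          # initupdate: reset the watermark

    def on_master_change(self, changed_id):
        # trigger: remember the smallest id changed since the refresh began
        self.lastid = min(self.lastid, changed_id)

    def end_refresh(self, incnostart):
        # If lastid < IDMAX the master changed during the refresh window:
        # keep lastid so the next cycle re-fetches from there. Otherwise
        # advance the watermark to the last id we actually copied.
        if self.lastid == IDMAX:
            self.lastid = incnostart

r = ResourceRecord()
r.on_master_change(200)       # the master changed row 200 mid-refresh
r.end_refresh(incnostart=500)
print(r.lastid)  # 200 -> the next cycle refreshes from id 200 again
```

If no trigger fires during the window, lastid stays at IDMAX and end_refresh advances it to the last copied id, so nothing is re-fetched unnecessarily.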
The procedures of the program cluster are documented:
1) *clusterproc.prg
Function createloc(lctable,lc1) && path to the local dbc
t1=Seconds()
idmax=999999999 && used to control changes on the master server (a kind of global variable)
?lctable
Select 0
Use dnevnal && cluster.exe is in the application root directory
Set Order To incno && field id - autoincrement
If Not File(lc1+".dbc") && at first start create the database
Create Database (lc1)
?lctable
Select * From (lctable) Into Table (lc1) Order By Id &&remote table
Brow
Use
Open Database (lc1)
Add Table (lc1)
Use (lc1) Excl
Index On Id Tag Id && index on the auto-increment field
Index On syd Tag syd && index on document number
Use
Close Databases
Endif
Select 0
Use (lc1) Alias Local
Endfunc
Function initupdate(idmax)
Select Resource
Locate For Trim(Resource.Node)==Trim(Sys(0))
Replace lastid With idmax,stop With .F. && in case of server changes this value will be changed (for local use)
ENDFUNC
Function updateloc(lctable,lclastid,lclastlocid)
lcc=Sys(2015)
*download the "old" records to update the local data
* Improve the protocol: if no changes, throw away the client's update
Select * From (lctable) nofilter Into Cursor (lcc) Where Id>=lclastid And Id<=lclastlocid
Go Bott
?"Number of downloaded records:", Recno()
Go Top
*update the local file with new records
Do While Not Eof()
Select (lcc)
Scatter Name m.orig
m.id=Id
Select Local
Seek m.id Order Tag Id
If Eof()
Append Blank
Endif
Gather Name m.orig
Select (lcc)
Skip
Enddo
Select (lcc)
Use
Endfunc && updateloc
Function insertloc(lclastlocid)
*pull the "new" records to append to local data
lcc=Sys(2015)
Select * From dnevnal nofilter Where dnevnal.Id>=lclastlocid Into Cursor (lcc)
Select Local
Append From Dbf(lcc)
Go Bott
?"Number of total records:", Recno()
Select (lcc)
Use
Endfunc && insertloc
2) *mesproc.prg
function init
application.Visible =.t.
SET REFRESH TO 1,0.1 && you can change these parameters
*?SYS(3054,11)
SET SAFETY OFF
SET STATUS off
SET TALK off
SET MULTILOCKS ON
SET ENGINEBEHAVIOR 70
CLOSE DATABASES all
SET EXCLUSIVE off
ACTIVATE screen
***lsecno=100 && max number of possible (not catched) batch insert
ENDFUNC
function openres(lc1) && path to the local dbc
SELECT 0
USE resource
LOCATE FOR TRIM(node)==TRIM(SYS(0)) && workstations' (nodes') name
IF NOT FOUND()
INSERT INTO resource (node,mval,filepath,stop) VALUES (SYS(0),.f.,lc1,.f.) &&publish a message
ELSE
* mval=.t. - refresh local data, publish a message
UPDATE resource SET resource.mval = .t. , filepath = lc1, stop=.f. WHERE TRIM(resource.node)==TRIM(SYS(0))
ENDIF
ENDFUNC &&openres
FUNCTION checkNew()
SELECT resource
LOCATE FOR TRIM(resource.node)==TRIM(SYS(0))
RLOCK() && refresh the records' data
UNLOCK
ENDFUNC
FUNCTION sendMesDone()
SELECT resource && publish a message the update is done
*If server data was updated in this period: resource.lastid<idmax
UPDATE resource SET resource.mval = .f. , resource.lastid = IIF(resource.lastid<idmax, resource.lastid, m.incnostart),;
resource.cluster = .f. WHERE TRIM(resource.node)==TRIM(SYS(0))
ENDFUNC &&sendMesDone
FUNCTION sendMesHb() && update the heartbeat (publish a message: I am alive)
UPDATE resource SET resource.hb = .t. WHERE TRIM(resource.node)==TRIM(SYS(0))
endfunc
FUNCTION sendMesStop() && stop the service
UPDATE resource SET resource.stop = .t. WHERE TRIM(resource.node)==TRIM(SYS(0))
endfunc
The presented program is a simple VFP procedure; the resource file and its management are what is specific. When we, or a VFP statement, change a record, every node notices the change and its local process automatically refreshes the local data. After reading a record in the table “resource”, a VFP application can use these values:
- The local service “heartbeat” (field hb) tells us whether the service is working or down;
- The field “cluster” tells us whether the table dnevnal has been updated;
- In “lastid” there is the smallest auto-increment number of the set of records to be downloaded;
- When VFP application sets mval to true, it sends a message to the local service and after receiving it (reading it) in less than a second, the service refreshes the local data (pulls from the master server the possible changes).
- The value in the field node is the workstation “key”.
- There is also a path to the local data (filePath).
A node application (server/client/service) writes a node state and/or a message into the table resource. VFP propagates this data to all nodes. On every node you can see the state of all running services. By locking a node’s record you can read and write (manipulate) its state. For example, if on Node1 we look at the record with the name “Node2” and its hb field is true, the service on Node2 is running; we also see the last increment number in its local table.
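Reading a peer's state can be sketched like this, in Python rather than VFP; the record contents and the peer_state helper are illustrative, with field names taken from the resource table:

```python
# The shared resource table as a list of per-node records; any peer can
# inspect any other peer's state by reading its record.

resource = [
    {"node": "Node1", "hb": True,  "cluster": False, "lastid": 500},
    {"node": "Node2", "hb": False, "cluster": True,  "lastid": 420},
]

def peer_state(table, node):
    """Classify a peer from its resource record."""
    rec = next(r for r in table if r["node"] == node)
    if not rec["hb"]:
        return "service down"            # no heartbeat
    if rec["cluster"]:
        return "refresh pending"         # master changed, not yet pulled
    return "replica current"

print(peer_state(resource, "Node1"))  # replica current
print(peer_state(resource, "Node2"))  # service down
```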
Note. You have to manipulate data using SQL commands (they lock, fire events and propagate data).
On the computer »server«, in the root application folder, create a project »cluster«; add to it the program »cluster.prg«, the database table »dnevnal« and the “resource” table. After building the project, you get “cluster.exe”. Run this VFP application from the computer »Cluster1«. At first start, it downloads the data in 180 seconds on my wireless-connected client (30 seconds on a virtual machine client). It starts a replica server, refreshes the local data (c:\temp\local.dbc) in a loop, stops executing and waits briefly. It also manipulates messages. It runs in parallel with other workstation applications.
The computer »Cluster1« becomes a new local client/server with a local data copy (replica) (c:\temp\local.dbf) of the master server’s »dnevnal«. It connects to both databases, reads from the source, and writes to the target.
Let us illustrate how to use the replica server within your application and how to solve the problem posted in this article.
Suppose you have a VFP application Test with a large database table Dnevnal on the master server (computer). Add a VFP form to this application; add to it a grid with a name Grid1 and a combo box with a name Combo1. In the form’s Data Environment add a table dnevnal and in its OpenTable method add this program:
* OpenTable
SET REFRESH TO 1,1
SET EXCLUSIVE OFF
SYS(3054,11)
USE resource
LOCATE FOR TRIM(node)==TRIM(SYS(0))
IF NOT FOUND()
MESSAGEBOX("Error, Local service not found!")
return
ENDIF
RLOCK() && catch the message
unlock
IF NOT hb
MESSAGEBOX("Error, Local service not running!")
return
ENDIF
RLOCK()
unlock
IF not resource.mval
* publish a message to update the local data
UPDATE resource SET resource.mval = .t. WHERE TRIM(resource.node)==TRIM(SYS(0))
* runs in parallel with the queries in cluster.exe
Endif
This program checks whether the local service is running (hb=.T., node service is present). It sets mval to true and in this way it pushes (sends) a message to the local service to refresh the local data. The local service executes in parallel with the form’s init event. After sending the message, the form initialization continues.
In the form’s init method add this program:
*Init
SET EXCLUSIVE off
*Cluster.exe is running in parallel and is downloading the eventual changes from the server
t1=SECONDS()
lcerr=.f.
ACTIVATE screen
sele resource
LOCATE FOR TRIM(node)==TRIM(SYS(0)) && find the nodes' record
IF NOT FOUND()
MESSAGEBOX("Error, Node service not found!")
lcerr=.t.
ENDIF
RLOCK() && catch the message
unlock
IF NOT hb && heartbeat is false, service is not running
MESSAGEBOX("Error, Node service not running!")
lcerr=.t.
ENDIF
lcloc=resource.filepath
IF NOT lcerr
ii=0
DO WHILE resource.mval && wait until cluster.exe refreshes the local data
RLOCK() &&refresh - catch the message
UNLOCK
SELECT dnevnal
WAIT TIMEOUT 0.1
*UPDATE dnevnal SET koli=koli+1 WHERE id= 812325
ii=ii+1
IF ii=100 && stop waiting, there is an error, this is a timeout
lcerr=.t.
exit
endif
ENDDO
?"Till now this program is running in parallel with Cluster.exe",SECONDS()-t1
endif
lccus=SYS(2015)
* The next query is not optimized. You can improve speed by saving aggregate data in local storage.
IF lcerr && if problems with local service, download data from server
lcloc="dnevnal" &&load balancing switch : if local server is in trouble, pull data from the master
ENDIF
?"End opening tables.",SECONDS()-t1
t1=SECONDS()
SELECT distinct syd,naziv FROM (lcloc) INTO CURSOR (lccus) ORDER BY syd
*the cursor lccus dataset is a normalized version of the original data
?"End query local data.",SECONDS()-t1
thisform.combo1.RowSource = (lccus)
thisform.combo1.Refresh ()
This program continues the process started in the form’s OpenTables event. It tests whether the service is running; after that it pauses until resource.mval becomes false, i.e. until the parallel program cluster stops refreshing the local data. When the parallel service finishes, the program continues its execution. If everything is OK, the program runs the query on the replica (local) database; otherwise it switches the query (slow execution) to the master’s database. This program is smart: if the local server is down, it switches to the master server. The program pulls the data from one of two servers (the master or the replica-local).
Note. Here you can insert other server load-balancing methods.
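The load-balancing switch above can be sketched like this, in Python rather than VFP; choose_source is an illustrative name, and the two source strings stand in for the replica path and the master table:

```python
# Pick the data source for the form's query: the fast local replica when its
# service is healthy and its refresh has finished, otherwise the master.

def choose_source(heartbeat_ok, refresh_done, local_path, master_table):
    """Return the source the form should query."""
    if heartbeat_ok and refresh_done:
        return local_path    # fast: local replica
    return master_table      # slow but always correct: master server

print(choose_source(True, True, "c:/temp/local", "dnevnal"))   # c:/temp/local
print(choose_source(False, True, "c:/temp/local", "dnevnal"))  # dnevnal
```

Other policies (e.g. preferring the replica only below some staleness bound) would slot in at the same decision point.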
The Combo1 InteractiveChange event has this program:
*InteractiveChange
lcg=SYS(2015)
ACTIVATE screen
SYS(3054,11)
t1=SECONDS()
lcd=VAL(this.Value) && documents' number
SELECT * FROM dnevnal INTO CURSOR (lcg) WHERE syd==lcd && download a document
?"Elapsed time to download a document:",SECONDS()-t1
Select (thisform.grid1.RecordSource)
use
thisform.grid1.Visible = .T.
thisform.grid1.RecordSource=(lcg)
Compile the project Test and run it from the workstation computer »Cluster1«. On my wireless-connected client it executes in 4 seconds, about 45 times faster (180/4). It runs in parallel with the replica server service Cluster.exe.
If you use this technique, your queries could run faster; you will have to adapt the programs from this article to your own data.
In a normal VFP application you execute code sequentially, so there is limited need for events. When you execute processes in parallel, events become a precious tool. Parallel processes are workers with limited knowledge. They work together, so they need coordination (management). Events and messages can help.
Let us describe the resource table and its data as a component of an event system. We have three event sources (the master, the client and the local service) and two listeners (the client and the local service). When an update (insert) fires on dnevnal, the update/insert trigger raises an event: it puts the value .T. in the field “cluster” in all records of the table resource (it sends a request to all nodes). The listeners of this event are the local services (cluster.exe). They read this value (catch the master’s message), refresh the local server’s data and raise a new event (end of local data update). We publish this event by updating the resource table record: cluster=.F., mval=.F., hb=.T. and the last local AI (increment) number. Publishing (raising) this event means writing the node’s new (local) state into the resource record.
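The event cycle just described can be sketched as a tiny state machine, in Python rather than VFP; the function names are illustrative, the field names follow the resource table:

```python
# One resource record as a dict; the master publishes cluster=.T., the local
# service consumes it, refreshes, and publishes the "done" state back.

state = {"cluster": False, "mval": False, "hb": False, "lastid": 0}

def master_update_event(s):
    """Trigger on dnevnal fires: request a refresh from all replicas."""
    s["cluster"] = True

def local_service_cycle(s, new_lastid):
    """One loop iteration of the local service (the listener)."""
    if s["cluster"] or s["mval"]:
        # ... pull the changed chunks from the master here ...
        s.update(cluster=False, mval=False, hb=True, lastid=new_lastid)

master_update_event(state)
local_service_cycle(state, new_lastid=500)
print(state["cluster"], state["hb"], state["lastid"])  # False True 500
```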
The Test application is an event publisher: it writes mval=.T. in the record; the local service (Cluster.exe), a listener (subscriber), catches this value, refreshes the local data and publishes a new state. The Test application is also a listener: it watches the field mval (in the form’s init event). When mval becomes false (end of local data update), the form’s init event continues its execution.
Using the locking system and events the VFP programs coordinate three types of parallel processes: the master, the client and the replica server.
The resource table is a container for public variables (values). All its users can manipulate this data.
»A peer-to-peer (abbreviated to P2P) computer network is one in which each computer in the network can act as a client or server for the other computers in the network, allowing shared access to various resources« (Wikipedia). The table resource and the node service (Cluster) are components of a peer-to-peer system with a central service. Each peer (node) has server and client functionality.
Because the message path always goes through a central node (the master server), we call this a peer-to-peer system with a central service.
Peers (nodes, local services) all have similar capabilities; they are autonomous and can provide services to, or interact with, each other. The workstations (and the master server) on which you run Cluster.exe have client and server capabilities and provide ways to control the divergence of replica contents. Your application (a client) can use these clients/servers.
When you run the »cluster.exe« on the workstation’s computer, you extend the network capabilities using the VFP distribution style:
· The VFP application has an important computation role in the network; it takes some load from the server and does some work on the workstation (such as running a query);
· You add it more parallel capability by adding a replica server; this takes load from the server and the network. The local service runs asynchronously, in parallel with your application.
You start »cluster.exe« in a fire and forget way (asynchronously). It adds to the workstation a replica server: local service, database and a table (c:\temp\local.dbf). When you run your basic application (in our case »Test.exe«), you run it in parallel with the local replica service (cluster.exe). The two data sessions run in parallel.
Attention, the two applications interact! From the parallel (concurrent) point of view this is an unsafe cooperation; the second application could change (corrupt) the first application’s data or vice versa.
With the application “cluster.exe” we increase the number of applications in our network. In the overall system we increase the chance of errors. Let us analyze some error possibilities:
1) The first application (cluster.exe) does not start or crashes. The second application (Test.exe) will still execute correctly, but much more slowly. You can remove this inconvenience by restarting the first. When the Cluster application restarts, it reads the minimal id (increment number) of the last updated record (to identify the potentially missing updates) and refreshes the local database.
Note. A process’s restart capability is part of its added value (it is state-independent).
2) The first application is simple and has no statements that can affect the second application data. In a production environment the first application is not visible, so there is no user intervention in it. The second application (Test) statements can affect the first application data only using the table resource and dnevnal.
Each of the two applications has its own data session: they share the tables “dnevnal”, “local” and “resource”. The write/read/refresh operations in these parallel data sessions obey the VFP locking and refresh rules. The programmer’s responsibility is to update only the master data.
Let us analyze two totally different processes, for example:
1) Inside the Test application run the form two times. You run them sequentially: they access the value mval sequentially (one after the other).
2) Run the application Test two times on the same workstation and in each of them run a form. The two forms run in parallel and share the resource mval. If you change this value in one of the forms, you can affect the other form’s execution, and this can be a problem.
If you watch the two applications in Windows Task Manager, each one’s memory usage is bounded above by 250 MB. When it stops loading data, its RAM usage drops to 17 MB. The local VFP service can become critical in memory consumption only if you run in parallel multiple VFP applications with large data tables. The parallel process Cluster.exe consumes a small part of the processor and RAM resources.
You can put more tables in a local database, theoretically all the database tables. The cost of adding smaller tables to the replica server is high, as the VFP engine has to refresh multiple tables on multiple replica servers (nodes). VFP transports smaller tables at a reasonable speed; it is better to hold them on the server. Normally a VFP application has only a few large tables that are worth replicating locally and querying in parallel. Do this judiciously, as the overhead of putting many tables in the replica can cut the benefits.
You can refresh many local tables in the same exe (for example, in »cluster«). In my opinion a better way is to create a separate service (similar to »cluster«) per local table and execute them in parallel. In case a service crashes, you have to restart only that one. The processes are expensive, but the most expensive thing is their serial execution!
In the experiments of this article we have used the local replica as a read only storage. Our problem has been to improve the data download. The local database can receive changes from the server, but must not accept local changes.
Theoretically a client can make changes to local data. In this case you have to solve a relatively difficult problem: how to synchronize local data changes with the server (bidirectionally). For example, your local server's data can be stale. When you update the master's (not stale) data, you have to resolve the conflict between the server's fresh data and the client's stale data.
The local replica is a backup of the table dnevnal. In our example it is at most 1 second stale. With our original query (A) we have 180-second-stale data; in the solution with a local service only 4 seconds (1 second to pull data from the master server and 3 seconds to query the local data). We reduce the chance of unseen writes in our combo box by a factor of 45 (180/4).
The code in this article is not optimized. My goal has been to prove that you can run VFP applications faster by using simple (not optimized) parallel code.
In our example we replicate all the data from the table dnevnal. What we really need locally is a »normalized« subset of the master's data described by this query
»select distinct syd,naziv,id from dnevnal order by syd« (B)
(pull the document number and customer name). This is a smaller data set. You can download, refresh and query this data faster. It is enough if your replica has this data set.
The query (B) is similar to a read only VFP local view.
In our example we have watched:
- Two server resources: table change events (field cluster) and the smallest autoincrement number of the set of modified records;
- Four local resources: whether the local data has been refreshed; the highest auto-increment number in the local table; the path to the local database; the service heartbeat (its presence).
You can add extra fields to the resource table and watch more local and/or server resources, for example the network path to the local database (NP), such as »\\P2\c\temp\local«. Your local services can become the »watchdog« applications of your network resources. All peers can use this »watchdog« service. On every peer (node) you can catch messages from other peers: browse the resource table and you have a list of the present peers. For example, on the node (peer) P1 you can see whether the local service on node (peer) P2 is present (its hb is true) and whether its data is updated (its cluster value is .F.). You can read its network path (NP) and pull data from the peer P2, such as »use \\P2\c\temp\local«.
You can put in the table resource other local data: active processes, opened files, file system information ....
We have a network with multiple servers, a resource table and a synchronization (refresh) process. You can use remote services on it in various ways, let us illustrate an example.
On Node2 we can start the refresh process on Node1: in Node1's record of the resource table we replace mval with .T. (and skip to another record). When we set mval to true, we send a message to the node “Node1”; more precisely, we put this message in the “Node1” message queue. The service on “Node1” will read this message from its message queue and execute the process on the computer “Node1”. In our example, it will start the refresh process defined in the program Cluster.
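A sketch of sending this one-slot message from Node2; the field names follow the resource table used above:

```foxpro
* Sketch: from Node2, ask Node1 to refresh its replica by
* setting mval in Node1's resource record (a one-slot queue).
USE resource SHARED
LOCATE FOR ALLTRIM(node) == "Node1"
IF FOUND()
   REPLACE mval WITH .T.    && the service on Node1 polls this flag
   SKIP                     && move off the record to flush the change
ENDIF
```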
In this article I have distributed content (data), but we can also distribute the execution (computation).
Our message queue has a limited length (1). Let us illustrate how to enlarge it. Create a new table “mesqueue” (message queue) with these fields: Sender C(80), Node C(80), Message C(256), mesId C(10), Result C(89), Done L. A new message means a new record in this table, inserted with a command such as:
Insert into mesqueue (sender, node, message, done) values (“Node2”, “Node1”, “select syd, naziv from local into table 'c:\temp\loc1.dbf' where syd=10000”, .F.) (C)
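The message-queue table described above can be created with a command like this (a sketch; field widths follow the text):

```foxpro
* Sketch: create the message-queue table described in the text.
CREATE TABLE mesqueue ( ;
   Sender  C(80), ;
   Node    C(80), ;
   Message C(256), ;
   mesId   C(10), ;
   Result  C(89), ;
   Done    L )
```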
The interpretation of this command is: from node Node2 send a message to Node1. The message handler (callback function) on Node1 has to read this message, execute the command and return the path to the table with the results.
You also have to modify the program Cluster: it has to read from this table, execute the query (C) and write \\node1\temp\loc1.dbf in the field Result. After that it has to find the next “Node1” record (the next message), and so on. The application on “Node2” reads the results with a command such as use \\node1\temp\loc1.dbf and sets Done to true. Using the message, we have »distributed work« from Node2 to Node1. On Node2 we waited for the result, so the execution was synchronous.
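The message-handler part of the program Cluster might look like the following sketch. The result path is hard-coded here for illustration only; a real handler would derive it from the executed command, and error handling is omitted:

```foxpro
* Sketch of a message-handler loop for the program Cluster on Node1.
* It reads pending messages addressed to this node, executes the
* command with macro substitution, and writes back the result path.
lcMe = "Node1"
USE mesqueue SHARED
SCAN FOR ALLTRIM(node) == lcMe AND !done AND EMPTY(result)
   lcCmd = ALLTRIM(message)
   &lcCmd                                      && run the query (C)
   REPLACE result WITH "\\node1\temp\loc1.dbf" && illustrative path
ENDSCAN
```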
The table mesqueue is a message queue and a message log.
In the last example we sent a procedure to be executed on the remote node. We can look at it as a remote procedure call (RPC): in a message we put a function (a select statement) to be executed on the receiving node. After executing the function, the remote node sends us back the results.
You can put a larger number of messages in the message queue (table mesqueue). In the program Cluster you define the order of their execution.
If you execute the last example on the master node instead of »Node1« and send the message »select distinct syd,naziv from dnevnal into table c:\temp\loc1.dbf«, the query executes on the server computer. It executes fast (about 1 second), and you pull over the wire only a small subset of the data, in the table loc1.dbf. Your server obtains the capabilities of a real database server: the queries are executed on it and not on the client computer. You can transport programs to the server and results from it with simple VFP commands (without direct manipulation of TCP, sockets, threads and protocols).
You can improve your cluster by adjusting the refresh intervals and other parameters.
The program cluster.prg has two functions: 1) creating the replica database, 2) refreshing the replicated data. You can create more programs from it: create the replica, refresh it and execute functions (callback functions).
In the example we have used two refresh strategies. You can add a new one or remove the “orchestra style” (after the master data changes, the services start an asynchronous refresh of the replica tables). It is enough if you refresh the replicated data on demand, in the form's events.
When you have fresh data about a node's resources, you can build a load-balancing manager on top of it. We shall discuss this matter in another article.
VFP has built-in tools you can use to add new distribution capabilities. The local replica can improve the client's query speed (in our example by a factor of 45). VFP makes it easy to set up the parallel replica server: you add a resource table and a simple application. You also have to make some changes to your application to make use of this replica.
The parallel process coordination is built on top of the resource table, the events and the messages. In the same way you can manage other nodes' resources and parallel processes. You can distribute resources (data and execution), manage them and load-balance them.
In addition to the various techniques for improving data retrieval in VFP (indexes, normalization, query optimization), we can also include work and data distribution.